Molecules for disease detection and treatment

ABSTRACT

The invention provides human molecules for disease detection and treatment (MDDT) and polynucleotides which identify and encode MDDT. The invention also provides expression vectors, host cells, antibodies, agonists, and antagonists. The invention also provides methods for diagnosing, treating, or preventing disorders associated with aberrant expression of MDDT.

TECHNICAL FIELD

[0001] This invention relates to nucleic acid and amino acid sequences of molecules for disease detection and treatment and to the use of these sequences in the diagnosis, treatment, and prevention of cell proliferative, autoimmune/inflammatory, developmental, and neurological disorders, and in the assessment of the effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of molecules for disease detection and treatment.

BACKGROUND OF THE INVENTION

[0002] It is estimated that only 2% of mammalian DNA encodes proteins, and only a small fraction of the genes that encode proteins are actually expressed in a particular cell at any time. The various types of cells in a multicellular organism differ dramatically both in structure and function, and the identity of a particular cell is conferred by its unique pattern of gene expression. In addition, different cell types express overlapping but distinctive sets of genes throughout development. Cell growth and proliferation, cell differentiation, the immune response, apoptosis, and other processes that contribute to organismal development and survival are governed by regulation of gene expression. Appropriate gene regulation also ensures that cells function efficiently by expressing only those genes whose functions are required at a given time. Factors that influence gene expression include extracellular signals that mediate cell-cell communication and coordinate the activities of different cell types. Gene expression is regulated at the level of DNA and RNA transcription, and at the level of mRNA translation.

[0003] Aberrant expression or mutations in genes and their products may cause, or increase susceptibility to, a variety of human diseases such as cancer and other cell proliferative disorders. The identification of these genes and their products is the basis of an ever-expanding effort to find markers for early detection of diseases and targets for their prevention and treatment. For example, cancer represents a type of cell proliferative disorder that affects nearly every tissue in the body. The development of cancer, or oncogenesis, is often correlated with the conversion of a normal gene into a cancer-causing gene, or oncogene, through abnormal expression or mutation. Oncoproteins, the products of oncogenes, include a variety of molecules that influence cell proliferation, such as growth factors, growth factor receptors, intracellular signal transducers, nuclear transcription factors, and cell-cycle control proteins. In contrast, tumor-suppressor genes are involved in inhibiting cell proliferation. Mutations which reduce or abrogate the function of tumor-suppressor genes result in aberrant cell proliferation and cancer. Thus a wide variety of genes and their products have been found that are associated with cell proliferative disorders such as cancer, but many more may exist that are yet to be discovered.

[0004] DNA-based arrays can provide an efficient, high-throughput method to examine gene expression and genetic variability. For example, SNPs, or single nucleotide polymorphisms, are the most common type of human genetic variation. DNA-based arrays can dramatically accelerate the discovery of SNPs in hundreds and even thousands of genes. Likewise, such arrays can be used for SNP genotyping in which DNA samples from individuals or populations are assayed for the presence of selected SNPs. These approaches will ultimately lead to the systematic identification of all genetic variations in the human genome and the correlation of certain genetic variations with disease susceptibility, responsiveness to drug treatments, and other medically relevant information. (See, for example, Wang, D. G. et al. (1998) Science 280:1077-1082.)

[0005] DNA-based array technology is especially important for the rapid analysis of global gene expression patterns. For example, genetic predisposition, disease, or therapeutic treatment may directly or indirectly affect the expression of a large number of genes in a given tissue. In this case, it is useful to develop a profile, or transcript image, of all the genes that are expressed and the levels at which they are expressed in that particular tissue. A profile generated from an individual or population affected with a certain disease or undergoing a particular therapy may be compared with a profile generated from a control individual or population. Such analysis does not require knowledge of gene function, as the expression profiles can be subjected to mathematical analyses which simply treat each gene as a marker. Furthermore, gene expression profiles may help dissect biological pathways by identifying all the genes expressed, for example, at a certain developmental stage, in a particular tissue, or in response to disease or treatment. (See, for example, Lander, E. S. et al. (1996) Science 274:536-539.)
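By way of illustration only, the comparison of a disease profile with a control profile, with each gene treated simply as a marker, might be sketched as follows. The function names, the pseudocount, and the two-fold threshold are assumptions introduced for this sketch and are not taken from the specification; the expression values in the usage note are invented.

```python
import math


def log2_fold_changes(disease, control, pseudocount=1.0):
    """Return log2(disease / control) for every gene present in both profiles.

    Each profile maps a gene identifier to an expression level. The
    pseudocount (an assumption of this sketch) avoids division by zero
    for genes that are not expressed in one of the profiles.
    """
    shared = sorted(set(disease) & set(control))
    return {
        gene: math.log2((disease[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in shared
    }


def changed_markers(disease, control, threshold=1.0):
    """List genes whose absolute log2 fold change meets the threshold.

    A threshold of 1.0 corresponds to a two-fold change in either
    direction; no knowledge of gene function is used.
    """
    changes = log2_fold_changes(disease, control)
    return [gene for gene, lfc in changes.items() if abs(lfc) >= threshold]
```

For example, with the hypothetical profiles {"geneA": 7.0, "geneB": 3.0} (disease) and {"geneA": 1.0, "geneB": 3.0} (control), only geneA would be reported as a changed marker.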

[0006] Certain genes are known to be associated with diseases because of their chromosomal location, such as the genes in the myotonic dystrophy (DM) regions of mouse and human. The mutation underlying DM has been localized to a gene encoding the DM-kinase protein, but another active gene, DMR-N9, is in close proximity to the DM-kinase gene (Jansen, G. et al. (1992) Nat. Genet. 1:261-266). DMR-N9 encodes a 650 amino acid protein that contains WD repeats, motifs found in cell signaling proteins. DMR-N9 is expressed in all neural tissues and in the testis, suggesting a role for DMR-N9 in the manifestation of mental and testicular symptoms in severe cases of DM (Jansen, G. et al. (1995) Hum. Mol. Genet. 4:843-852).

[0007] Other genes are identified based upon their expression patterns or association with disease syndromes. For example, autoantibodies to subcellular organelles are found in patients with systemic rheumatic diseases. A recently identified protein, golgin-67, belongs to a family of Golgi autoantigens having alpha-helical coiled-coil domains (Eystathioy, T. et al. (2000) J. Autoimmun. 14:179-187). The Stac gene was identified as a brain specific, developmentally regulated gene. The Stac protein contains an SH3 domain, and is thought to be involved in neuron-specific signal transduction (Suzuki, H. et al. (1996) Biochem. Biophys. Res. Commun. 229:902-909).

[0008] The discovery of new molecules for disease detection and treatment, and the polynucleotides encoding them, satisfies a need in the art by providing new compositions which are useful in the diagnosis, prevention, and treatment of cell proliferative, autoimmune/inflammatory, developmental, and neurological disorders, and in the assessment of the effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of molecules for disease detection and treatment.

SUMMARY OF THE INVENTION

[0009] The invention features purified polypeptides, molecules for disease detection and treatment, referred to collectively as “MDDT” and individually as “MDDT-1,” “MDDT-2,” “MDDT-3,” “MDDT-4,” “MDDT-5,” “MDDT-6,” “MDDT-7,” “MDDT-8,” “MDDT-9,” “MDDT-10,” “MDDT-11,” “MDDT-12,” “MDDT-13,” “MDDT-14,” “MDDT-15,” “MDDT-16,” “MDDT-17,” “MDDT-18,” “MDDT-19,” “MDDT-20,” “MDDT-21,” “MDDT-22,” “MDDT-23,” “MDDT-24,” “MDDT-25,” and “MDDT-26.” In one aspect, the invention provides an isolated polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26. In one alternative, the invention provides an isolated polypeptide comprising the amino acid sequence of SEQ ID NO:1-26.

[0010] The invention further provides an isolated polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26. In one alternative, the polynucleotide encodes a polypeptide selected from the group consisting of SEQ ID NO:1-26. In another alternative, the polynucleotide is selected from the group consisting of SEQ ID NO:27-52.

[0011] Additionally, the invention provides a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26. In one alternative, the invention provides a cell transformed with the recombinant polynucleotide. In another alternative, the invention provides a transgenic organism comprising the recombinant polynucleotide.

[0012] The invention also provides a method for producing a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26. The method comprises a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide encoding the polypeptide, and b) recovering the polypeptide so expressed.

[0013] Additionally, the invention provides an isolated antibody which specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26.

[0014] The invention further provides an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). In one alternative, the polynucleotide comprises at least 60 contiguous nucleotides.
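As a non-limiting illustration of the “at least 90% identical” criterion, percent identity between two sequences that have already been aligned might be computed as sketched below. This passage does not specify how identity is calculated, so the convention used here (matches divided by aligned columns, ignoring columns that are gaps in both sequences) is an assumption, as are the function name and the “-” gap character. Producing the alignment itself is outside the scope of the sketch.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns in which both sequences carry a gap are skipped; a column
    with a gap in only one sequence counts as a mismatch. This is one
    common convention, assumed here for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if the aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, two aligned ten-nucleotide sequences that differ at a single position are 90% identical and would satisfy the threshold.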

[0015] Additionally, the invention provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and optionally, if present, the amount thereof. In one alternative, the probe comprises at least 60 contiguous nucleotides.
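The requirement that a probe comprise at least 20 contiguous nucleotides complementary to the target may be illustrated in silico by the following sketch. It checks only for exact Watson-Crick complementarity of a DNA probe against a DNA target; actual hybridization depends on the conditions used and tolerates some mismatch, which this simplification does not model. The function names are hypothetical.

```python
# Translation table for the DNA bases A, C, G, T.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probe_matches_target(probe, target, min_length=20):
    """True if the probe has at least `min_length` nucleotides and is
    exactly complementary to a contiguous stretch of the target strand.
    """
    probe = probe.upper()
    if len(probe) < min_length:
        return False
    # The probe anneals antiparallel to the target, so its reverse
    # complement must appear as a substring of the target.
    return reverse_complement(probe) in target.upper()
```

For example, a probe built as the reverse complement of a 25-nucleotide stretch of a target passes the check, whereas a 10-nucleotide probe fails on length alone. Setting min_length to 60 corresponds to the alternative recited above.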

[0016] The invention further provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.

[0017] The invention further provides a composition comprising an effective amount of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and a pharmaceutically acceptable excipient. In one embodiment, the composition comprises an amino acid sequence selected from the group consisting of SEQ ID NO:1-26. The invention additionally provides a method of treating a disease or condition associated with decreased expression of functional MDDT, comprising administering to a patient in need of such treatment the composition.

[0018] The invention also provides a method for screening a compound for effectiveness as an agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting agonist activity in the sample. In one alternative, the invention provides a composition comprising an agonist compound identified by the method and a pharmaceutically acceptable excipient. In another alternative, the invention provides a method of treating a disease or condition associated with decreased expression of functional MDDT, comprising administering to a patient in need of such treatment the composition.

[0019] Additionally, the invention provides a method for screening a compound for effectiveness as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in the sample. In one alternative, the invention provides a composition comprising an antagonist compound identified by the method and a pharmaceutically acceptable excipient. In another alternative, the invention provides a method of treating a disease or condition associated with overexpression of functional MDDT, comprising administering to a patient in need of such treatment the composition.

[0020] The invention further provides a method of screening for a compound that specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26. The method comprises a) combining the polypeptide with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide to the test compound, thereby identifying a compound that specifically binds to the polypeptide.

[0021] The invention further provides a method of screening for a compound that modulates the activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26. The method comprises a) combining the polypeptide with at least one test compound under conditions permissive for the activity of the polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c) comparing the activity of the polypeptide in the presence of the test compound with the activity of the polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide.
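The comparison recited in step c) may be illustrated, purely as a sketch, by computing the relative change in mean measured activity with and without the test compound. The specification does not state how large a change must be, so the 20% cutoff below is an arbitrary assumption, as are the function names; a practical screen would also apply a statistical test across replicates.

```python
from statistics import mean


def relative_activity_change(with_compound, without_compound):
    """Fractional change in mean activity caused by the test compound.

    Both arguments are lists of replicate activity measurements in the
    same units. A negative result indicates reduced activity.
    """
    baseline = mean(without_compound)
    if baseline == 0:
        raise ValueError("baseline activity must be non-zero")
    return (mean(with_compound) - baseline) / baseline


def is_modulator(with_compound, without_compound, min_change=0.2):
    """True if activity changes by at least `min_change` (assumed cutoff)
    in either direction in the presence of the test compound.
    """
    change = relative_activity_change(with_compound, without_compound)
    return abs(change) >= min_change
```

With hypothetical replicate readings of 50.0 in the presence of the compound and 100.0 in its absence, the relative change is -0.5 and the compound would be flagged as a modulator.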

[0022] The invention further provides a method for screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, the method comprising a) exposing a sample comprising the target polynucleotide to a compound, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.

[0023] The invention further provides a method for assessing toxicity of a test compound, said method comprising a) treating a biological sample containing nucleic acids with the test compound; b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, iii) a polynucleotide complementary to the polynucleotide of i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Alternatively, the target polynucleotide comprises a fragment of a polynucleotide sequence selected from the group consisting of i)-v) above; c) quantifying the amount of hybridization complex; and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES

[0024] Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide sequences of the present invention.

[0025] Table 2 shows the GenBank identification number and annotation of the nearest GenBank homolog for polypeptides of the invention. The probability scores for the matches between each polypeptide and its homolog(s) are also shown.

[0026] Table 3 shows structural features of polypeptide sequences of the invention, including predicted motifs and domains, along with the methods, algorithms, and searchable databases used for analysis of the polypeptides.

[0027] Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble polynucleotide sequences of the invention, along with selected fragments of the polynucleotide sequences.

[0028] Table 5 shows the representative cDNA library for polynucleotides of the invention.

[0029] Table 6 provides an appendix which describes the tissues and vectors used for construction of the cDNA libraries shown in Table 5.

[0030] Table 7 shows the tools, programs, and algorithms used to analyze the polynucleotides and polypeptides of the invention, along with applicable descriptions, references, and threshold parameters.

DESCRIPTION OF THE INVENTION

[0031] Before the present proteins, nucleotide sequences, and methods are described, it is understood that this invention is not limited to the particular machines, materials and methods described, as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims.

[0032] It must be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to “a host cell” includes a plurality of such host cells, and a reference to “an antibody” is a reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so forth.

[0033] Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any machines, materials, and methods similar or equivalent to those described herein can be used to practice or test the present invention, the preferred machines, materials and methods are now described. All publications mentioned herein are cited for the purpose of describing and disclosing the cell lines, protocols, reagents and vectors which are reported in the publications and which might be used in connection with the invention. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

DEFINITIONS

[0034] “MDDT” refers to the amino acid sequences of substantially purified MDDT obtained from any species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

[0035] The term “agonist” refers to a molecule which intensifies or mimics the biological activity of MDDT. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other compound or composition which modulates the activity of MDDT either by directly interacting with MDDT or by acting on components of the biological pathway in which MDDT participates.

[0036] An “allelic variant” is an alternative form of the gene encoding MDDT. Allelic variants may result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in polypeptides whose structure or function may or may not be altered. A gene may have none, one, or many allelic variants of its naturally occurring form. Common mutational changes which give rise to allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. Each of these types of changes may occur alone, or in combination with the others, one or more times in a given sequence.

[0037] “Altered” nucleic acid sequences encoding MDDT include those sequences with deletions, insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as MDDT or a polypeptide with at least one functional characteristic of MDDT. Included within this definition are polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of the polynucleotide encoding MDDT, and improper or unexpected hybridization to allelic variants, with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding MDDT. The encoded protein may also be “altered,” and may contain deletions, insertions, or substitutions of amino acid residues which produce a silent change and result in a functionally equivalent MDDT. Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological or immunological activity of MDDT is retained. For example, negatively charged amino acids may include aspartic acid and glutamic acid, and positively charged amino acids may include lysine and arginine. Amino acids with uncharged polar side chains having similar hydrophilicity values may include: asparagine and glutamine; and serine and threonine. Amino acids with uncharged side chains having similar hydrophilicity values may include: leucine, isoleucine, and valine; glycine and alanine; and phenylalanine and tyrosine.
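The amino acid groupings listed above can be encoded as a simple lookup to flag which differences between two equal-length polypeptide sequences fall within a single group. The sketch below is illustrative only: it uses one-letter amino acid codes, treats the listed groups as exhaustive (the text says only that such groups “may include” these residues), and says nothing about whether biological or immunological activity is actually retained. All names are hypothetical.

```python
# Groups taken from the paragraph above, in one-letter code.
CONSERVATIVE_GROUPS = [
    {"D", "E"},       # negatively charged: aspartic acid, glutamic acid
    {"K", "R"},       # positively charged: lysine, arginine
    {"N", "Q"},       # uncharged polar: asparagine, glutamine
    {"S", "T"},       # uncharged polar: serine, threonine
    {"L", "I", "V"},  # uncharged: leucine, isoleucine, valine
    {"G", "A"},       # uncharged: glycine, alanine
    {"F", "Y"},       # uncharged: phenylalanine, tyrosine
]


def is_conservative(a, b):
    """True if residues a and b are identical or share a listed group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """Return (position, reference residue, variant residue, conservative?)
    for each differing position; positions are 1-based.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]
```

For instance, comparing the hypothetical peptides MDKLS and MEKLW reports an aspartic acid to glutamic acid change at position 2 as within-group and a serine to tryptophan change at position 5 as not.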

[0038] The terms “amino acid” and “amino acid sequence” refer to an oligopeptide, peptide, polypeptide, or protein sequence, or a fragment of any of these, and to naturally occurring or synthetic molecules. Where “amino acid sequence” is recited to refer to a sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms are not meant to limit the amino acid sequence to the complete native amino acid sequence associated with the recited protein molecule.

[0039] “Amplification” relates to the production of additional copies of a nucleic acid sequence. Amplification is generally carried out using polymerase chain reaction (PCR) technologies well known in the art.

[0040] The term “antagonist” refers to a molecule which inhibits or attenuates the biological activity of MDDT. Antagonists may include proteins such as antibodies, nucleic acids, carbohydrates, small molecules, or any other compound or composition which modulates the activity of MDDT either by directly interacting with MDDT or by acting on components of the biological pathway in which MDDT participates.

[0041] The term “antibody” refers to intact immunoglobulin molecules as well as to fragments thereof, such as Fab, F(ab′)₂, and Fv fragments, which are capable of binding an epitopic determinant. Antibodies that bind MDDT polypeptides can be prepared using intact polypeptides or using fragments containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

[0042] The term “antigenic determinant” refers to that region of a molecule (i.e., an epitope) that makes contact with a particular antibody. When a protein or a fragment of a protein is used to immunize a host animal, numerous regions of the protein may induce the production of antibodies which bind specifically to antigenic determinants (particular regions or three-dimensional structures on the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen used to elicit the immune response) for binding to an antibody.

[0043] The term “aptamer” refers to a nucleic acid or oligonucleotide molecule that binds to a specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX (Systematic Evolution of Ligands by EXponential Enrichment), described in U.S. Pat. No. 5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries. Aptamer compositions may be double-stranded or single-stranded, and may include deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules. The nucleotide components of an aptamer may have modified sugar groups (e.g., the 2′-OH group of a ribonucleotide may be replaced by 2′-F or 2′-NH₂), which may improve a desired property, e.g., resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules, e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system. Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a cross-linker. (See, e.g., Brody, E. N. and L. Gold (2000) J. Biotechnol. 74:5-13.)

[0044] The term “intramer” refers to an aptamer which is expressed in vivo. For example, a vaccinia virus-based RNA expression system has been used to express specific RNA aptamers at high levels in the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA 96:3606-3610).

[0045] The term “spiegelmer” refers to an aptamer which includes L-DNA, L-RNA, or other left-handed nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on substrates containing right-handed nucleotides.

[0046] The term “antisense” refers to any composition capable of base-pairing with the “sense” (coding) strand of a specific nucleic acid sequence. Antisense compositions may include DNA; RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified sugar groups such as 2′-methoxyethyl sugars or 2′-methoxyethoxy sugars; or oligonucleotides having modified bases such as 5-methyl cytosine, 2′-deoxyuracil, or 7-deaza-2′-deoxyguanosine. Antisense molecules may be produced by any method including chemical synthesis or transcription. Once introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring nucleic acid sequence produced by the cell to form duplexes which block either transcription or translation. The designation “negative” or “minus” can refer to the antisense strand, and the designation “positive” or “plus” can refer to the sense strand of a reference DNA molecule.

[0047] The term “biologically active” refers to a protein having structural, regulatory, or biochemical functions of a naturally occurring molecule. Likewise, “immunologically active” or “immunogenic” refers to the capability of the natural, recombinant, or synthetic MDDT, or of any oligopeptide thereof, to induce a specific immune response in appropriate animals or cells and to bind with specific antibodies.

[0048] “Complementary” describes the relationship between two single-stranded nucleic acid sequences that anneal by base-pairing. For example, 5′-AGT-3′ pairs with its complement, 3′-TCA-5′.
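By way of illustration only, the base-pairing relationship in the example above can be sketched in Python. The function names and the restriction to the four unmodified DNA bases are assumptions made for this sketch and are not part of the specification.

```python
# Watson-Crick pairing for unmodified DNA bases (illustrative assumption).
_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(seq_5_to_3: str) -> str:
    """Return the complementary strand read base by base, i.e. written 3'->5'."""
    return "".join(_PAIRS[base] for base in seq_5_to_3.upper())


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the same complementary strand written in the usual 5'->3' direction."""
    return complement_3_to_5(seq_5_to_3)[::-1]


print(complement_3_to_5("AGT"))   # TCA, corresponding to 3'-TCA-5'
print(reverse_complement("AGT"))  # ACT, the same strand written 5'-ACT-3'
```

The two functions describe the same strand; they differ only in the direction in which it is written.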

[0049] A “composition comprising a given polynucleotide sequence” and a “composition comprising a given amino acid sequence” refer broadly to any composition containing the given polynucleotide or amino acid sequence. The composition may comprise a dry formulation or an aqueous solution. Compositions comprising polynucleotide sequences encoding MDDT or fragments of MDDT may be employed as hybridization probes. The probes may be stored in freeze-dried form and may be associated with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be deployed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., sodium dodecyl sulfate; SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

[0050] “Consensus sequence” refers to a nucleic acid sequence which has been subjected to repeated DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied Biosystems, Foster City Calif.) in the 5′ and/or the 3′ direction, and resequenced, or which has been assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison Wis.) or Phrap (University of Washington, Seattle Wash.). Some sequences have been both extended and assembled to produce the consensus sequence.

[0051] “Conservative amino acid substitutions” are those substitutions that are predicted to least interfere with the properties of the original protein, i.e., the structure and especially the function of the protein is conserved and not significantly changed by such substitutions. The table below shows amino acids which may be substituted for an original amino acid in a protein and which are regarded as conservative amino acid substitutions.

Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr

[0052] Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of the side chain.
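By way of illustration only, the table of conservative substitutions set out above may be held as a lookup structure. The following Python sketch transcribes that table; the dictionary name and the helper function are hypothetical and are introduced solely for this example.

```python
# Transcription of the conservative substitution table (three-letter codes).
# Keys are original residues; values are the substitutions listed for each.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the table lists `replacement` for `original`."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())


print(is_conservative("Ala", "Ser"))  # True: Ser is listed for Ala
print(is_conservative("Ser", "Ala"))  # False: Ala is not listed for Ser
```

The lookup is directional because the table itself is not symmetric: a substitution listed for one original residue is not necessarily listed in the reverse direction.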

[0053] A “deletion” refers to a change in the amino acid or nucleotide sequence that results in the absence of one or more amino acid residues or nucleotides.

[0054] The term “derivative” refers to a chemically modified polynucleotide or polypeptide. Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains at least one biological or immunological function of the natural molecule. A derivative polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least one biological or immunological function of the polypeptide from which it was derived.

[0055] A “detectable label” refers to a reporter molecule or enzyme that is capable of generating a measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

[0056] “Differential expression” refers to increased or upregulated; or decreased, downregulated, or absent gene or protein expression, determined by comparing at least two different samples. Such comparisons may be carried out between, for example, a treated and an untreated sample, or a diseased and a normal sample.

[0057] “Exon shuffling” refers to the recombination of different coding regions (exons). Since an exon may represent a structural or functional domain of the encoded protein, new proteins may be assembled through the novel reassortment of stable substructures, thus allowing acceleration of the evolution of new protein functions.

[0058] A “fragment” is a unique portion of MDDT or the polynucleotide encoding MDDT which is identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise from 5 to 1000 contiguous nucleotides or amino acid residues. A fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid residues in length. Fragments may be preferentially selected from certain regions of a molecule. For example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the specification, including the Sequence Listing, tables, and figures, may be encompassed by the present embodiments.
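By way of illustration only, the contiguous fragments of a given length contained in a parent sequence may be enumerated as sketched below in Python. The function name and the example sequence are hypothetical; the sketch enforces only the length limit stated above (shorter than the parent) and does not test whether a fragment is unique.

```python
from typing import Iterator


def contiguous_fragments(parent: str, length: int) -> Iterator[str]:
    """Yield every contiguous fragment of `length` residues from `parent`.

    A fragment is shorter than the parent sequence, so `length` may be at
    most len(parent) - 1.
    """
    if not 1 <= length < len(parent):
        raise ValueError("fragment length must be between 1 and len(parent) - 1")
    for start in range(len(parent) - length + 1):
        yield parent[start:start + length]


# Hypothetical five-residue sequence used only to show the windowing.
print(list(contiguous_fragments("MKTAY", 3)))  # ['MKT', 'KTA', 'TAY']
```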

[0059] A fragment of SEQ ID NO:27-52 comprises a region of unique polynucleotide sequence that specifically identifies SEQ ID NO:27-52, for example, as distinct from any other sequence in the genome from which the fragment was obtained. A fragment of SEQ ID NO:27-52 is useful, for example, in hybridization and amplification technologies and in analogous methods that distinguish SEQ ID NO:27-52 from related polynucleotide sequences. The precise length of a fragment of SEQ ID NO:27-52 and the region of SEQ ID NO:27-52 to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

[0060] A fragment of SEQ ID NO:1-26 is encoded by a fragment of SEQ ID NO:27-52. A fragment of SEQ ID NO:1-26 comprises a region of unique amino acid sequence that specifically identifies SEQ ID NO:1-26. For example, a fragment of SEQ ID NO:1-26 is useful as an immunogenic peptide for the development of antibodies that specifically recognize SEQ ID NO:1-26. The precise length of a fragment of SEQ ID NO:1-26 and the region of SEQ ID NO:1-26 to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

[0061] A “full length” polynucleotide sequence is one containing at least a translation initiation codon (e.g., methionine) followed by an open reading frame and a translation termination codon. A “full length” polynucleotide sequence encodes a “full length” polypeptide sequence.
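By way of illustration only, the three elements of the foregoing definition may be checked for a DNA coding sequence as sketched below in Python. The sketch assumes an ATG initiation codon, the standard termination codons TAA, TAG, and TGA, and an input that begins at the initiation codon and ends at the termination codon with no untranslated regions; these assumptions and the function name are not drawn from the specification.

```python
# Standard termination codons in the DNA alphabet (assumption for this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_full_length(cds: str) -> bool:
    """Check for an initiation codon, an uninterrupted reading frame, and a stop."""
    cds = cds.upper()
    # At least a start codon and a stop codon, in whole codons.
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    if not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # The frame must end in a stop codon and contain no earlier stop codon.
    has_terminal_stop = codons[-1] in STOP_CODONS
    has_internal_stop = any(codon in STOP_CODONS for codon in codons[1:-1])
    return has_terminal_stop and not has_internal_stop


print(is_full_length("ATGGCCTAA"))     # True
print(is_full_length("ATGTAAGCCTAA"))  # False: premature stop codon
```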

[0062] “Homology” refers to sequence similarity or, interchangeably, sequence identity, between two or more polynucleotide sequences or two or more polypeptide sequences.

[0063] The terms “percent identity” and “% identity,” as applied to polynucleotide sequences, refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences.
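By way of illustration only, the percentage of residue matches may be computed from two sequences that have already been aligned and gapped, as sketched below in Python. The sketch assumes that gaps are written as "-" and that the denominator is the number of alignment columns; alignment programs differ in how they choose the denominator, so this is one possible convention rather than the calculation performed by any particular program.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if the residues agree and neither is a gap.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Five matching columns out of six, with one gap inserted by the aligner.
print(round(percent_identity("ACGT-A", "ACGTTA"), 2))  # 83.33
```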

[0064] Percent identity between polynucleotide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program. This program is part of the LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR, Madison Wis.). CLUSTAL V is described in Higgins, D. G. and P. M. Sharp (1989) CABIOS 5:151-153 and in Higgins, D. G. et al. (1992) CABIOS 8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5, window=4, and “diagonals saved”=4. The “weighted” residue weight table is selected as the default. Percent identity is reported by CLUSTAL V as the “percent similarity” between aligned polynucleotide sequences.

[0065] Alternatively, a suite of commonly used and freely available sequence comparison algorithms is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) (Altschul, S. F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from several sources, including the NCBI, Bethesda, Md., and on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis programs including “blastn,” that is used to align a known polynucleotide sequence with other polynucleotide sequences from a variety of databases. Also available is a tool called “BLAST 2 Sequences” that is used for direct pairwise comparison of two nucleotide sequences. “BLAST 2 Sequences” can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. The “BLAST 2 Sequences” tool can be used for both blastn and blastp (discussed below). BLAST programs are commonly used with gap and other parameters set to default settings. For example, to compare two nucleotide sequences, one may use blastn with the “BLAST 2 Sequences” tool Version 2.0.12 (Apr. 21, 2000) set at default parameters. Such default parameters may be, for example:

[0066] Matrix: BLOSUM62

[0067] Reward for match: 1

[0068] Penalty for mismatch: -2

[0069] Open Gap: 5 and Extension Gap: 2 penalties

[0070] Gap x drop-off: 50

[0071] Expect: 10

[0072] Word Size: 11

[0073] Filter: on

[0074] Percent identity may be measured over the length of an entire defined sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to describe a length over which percentage identity may be measured.

[0075] Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.

[0076] The phrases “percent identity” and “% identity,” as applied to polypeptide sequences, refer to the percentage of residue matches between at least two polypeptide sequences aligned using a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment methods take into account conservative amino acid substitutions. Such conservative substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at the site of substitution, thus preserving the structure (and therefore function) of the polypeptide.

[0077] Percent identity between polypeptide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program (described and referenced above). For pairwise alignments of polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap penalty=3, window=5, and “diagonals saved”=5. The PAM250 matrix is selected as the default residue weight table. As with polynucleotide alignments, the percent identity is reported by CLUSTAL V as the “percent similarity” between aligned polypeptide sequence pairs.

[0078] Alternatively, the NCBI BLAST software suite may be used. For example, for a pairwise comparison of two polypeptide sequences, one may use the “BLAST 2 Sequences” tool Version 2.0.12 (Apr. 21, 2000) with blastp set at default parameters. Such default parameters may be, for example:

[0079] Matrix: BLOSUM62

[0080] Open Gap: 11 and Extension Gap: 1 penalties

[0081] Gap x drop-off: 50

[0082] Expect: 10

[0083] Word Size: 3

[0084] Filter: on

[0085] Percent identity may be measured over the length of an entire defined polypeptide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be used to describe a length over which percentage identity may be measured.

[0086] “Human artificial chromosomes” (HACs) are linear microchromosomes which may contain DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for chromosome replication, segregation and maintenance.

[0087] The term “humanized antibody” refers to an antibody molecule in which the amino acid sequence in the non-antigen binding regions has been altered so that the antibody more closely resembles a human antibody, and still retains its original binding ability.

[0088] “Hybridization” refers to the process by which a polynucleotide strand anneals with a complementary strand through base pairing under defined hybridization conditions. Specific hybridization is an indication that two nucleic acid sequences share a high degree of complementarity. Specific hybridization complexes form under permissive annealing conditions and remain hybridized after the “washing” step(s). The washing step(s) is particularly important in determining the stringency of the hybridization process, with more stringent conditions allowing less non-specific binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill in the art and may be consistent among hybridization experiments, whereas wash conditions may be varied among experiments to achieve the desired stringency, and therefore hybridization specificity. Permissive annealing conditions occur, for example, at 68° C. in the presence of about 6×SSC, about 1% (w/v) SDS, and about 100 μg/ml sheared, denatured salmon sperm DNA.

[0089] Generally, stringency of hybridization is expressed, in part, with reference to the temperature under which the wash step is carried out. Such wash temperatures are typically selected to be about 5° C. to 20° C. lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. The T_m is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. An equation for calculating T_m and conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview N.Y.; specifically see volume 2, chapter 9.
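By way of illustration only, the wash temperature window described above follows from simple arithmetic once a thermal melting point has been determined by any suitable method. The Python sketch below takes the melting point as an input and does not compute it; the function name and the example value of 80° C. are hypothetical.

```python
from typing import Tuple


def wash_temperature_range(
    tm_celsius: float, min_offset: float = 5.0, max_offset: float = 20.0
) -> Tuple[float, float]:
    """Return (lowest, highest) wash temperature, 20 to 5 degrees C below the melting point."""
    return (tm_celsius - max_offset, tm_celsius - min_offset)


# Hypothetical melting point of 80 degrees C for a given probe and buffer.
print(wash_temperature_range(80.0))  # (60.0, 75.0)
```

Washing nearer the upper end of the window is the more stringent choice, since fewer imperfectly matched duplexes remain hybridized close to the melting point.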

[0090] High stringency conditions for hybridization between polynucleotides of the present invention include wash conditions of 68° C. in the presence of about 0.2×SSC and about 0.1% SDS, for 1 hour. Alternatively, temperatures of about 65° C., 60° C., 55° C., or 42° C. may be used. SSC concentration may be varied from about 0.1 to 2×SSC, with SDS being present at about 0.1%. Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents include, for instance, sheared and denatured salmon sperm DNA at about 100-200 μg/ml. Organic solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily apparent to those of ordinary skill in the art. Hybridization, particularly under high stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

[0091] The term “hybridization complex” refers to a complex formed between two nucleic acid sequences by virtue of the formation of hydrogen bonds between complementary bases. A hybridization complex may be formed in solution (e.g., C₀t or R₀t analysis) or formed between one nucleic acid sequence present in solution and another nucleic acid sequence immobilized on a solid support (e.g., paper, membranes, filters, chips, pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids have been fixed).

[0092] The words “insertion” and “addition” refer to changes in an amino acid or nucleotide sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.

[0093] “Immune response” can refer to conditions associated with inflammation, trauma, immune disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect cellular and systemic defense systems.

[0094] An “immunogenic fragment” is a polypeptide or oligopeptide fragment of MDDT which is capable of eliciting an immune response when introduced into a living organism, for example, a mammal. The term “immunogenic fragment” also includes any polypeptide or oligopeptide fragment of MDDT which is useful in any of the antibody production methods disclosed herein or known in the art.

[0095] The term “microarray” refers to an arrangement of a plurality of polynucleotides, polypeptides, or other chemical compounds on a substrate.

[0096] The terms “element” and “array element” refer to a polynucleotide, polypeptide, or other chemical compound having a unique and defined position on a microarray.

[0097] The term “modulate” refers to a change in the activity of MDDT. For example, modulation may cause an increase or a decrease in protein activity, binding characteristics, or any other biological, functional, or immunological properties of MDDT.

[0098] The phrases “nucleic acid” and “nucleic acid sequence” refer to a nucleotide, oligonucleotide, polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or synthetic origin which may be single-stranded or double-stranded and may represent the sense or the antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material.

[0099] “Operably linked” refers to the situation in which a first nucleic acid sequence is placed in a functional relationship with a second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where necessary to join two protein coding regions, in the same reading frame.

[0100] “Peptide nucleic acid” (PNA) refers to an antisense molecule or anti-gene agent which comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. PNAs preferentially bind complementary single stranded DNA or RNA and stop transcript elongation, and may be pegylated to extend their lifespan in the cell.

[0101] “Post-translational modification” of an MDDT may involve lipidation, glycosylation, phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by cell type depending on the enzymatic milieu of MDDT.

[0102] “Probe” refers to nucleic acid sequences encoding MDDT, their complements, or fragments thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. “Primers” are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR).

[0103] Probes and primers as used in the present invention typically comprise at least 15 contiguous nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may be considerably longer than these examples, and it is understood that any length supported by the specification, including the tables, figures, and Sequence Listing, may be used.

[0104] Methods for preparing and using probes and primers are described in the references, for example Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview N.Y.; Ausubel, F. M. et al. (1987) Current Protocols in Molecular Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York N.Y.; Innis, M. et al. (1990) PCR Protocols, A Guide to Methods and Applications, Academic Press, San Diego Calif. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge Mass.).

[0105] Oligonucleotides for use as primers are selected using software known in the art for such purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection programs have incorporated additional features for expanded capabilities. For example, the PrimOU primer selection program (available to the public from the Genome Center at University of Texas South West Medical Center, Dallas Tex.) is capable of choosing specific primers from megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer selection program (available to the public from the Whitehead Institute/MIT Center for Genome Research, Cambridge Mass.) allows the user to input a “mispriming library,” in which sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for microarrays. (The source code for the latter two primer selection programs may also be obtained from their respective sources and modified to meet the user's specific needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments identified by any of the above selection methods are useful in hybridization technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to those described above.

[0106] A “recombinant nucleic acid” is a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two or more otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.

[0107] Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is expressed, inducing a protective immunological response in the mammal.

[0108] A “regulatory element” refers to a nucleic acid sequence usually derived from untranslated regions of a gene and includes enhancers, promoters, introns, and 5′ and 3′ untranslated regions (UTRs). Regulatory elements interact with host or viral proteins which control transcription, translation, or RNA stability.

[0109] “Reporter molecules” are chemical or biochemical moieties used for labeling a nucleic acid, amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent, chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and other moieties known in the art.

[0110] An “RNA equivalent,” in reference to a DNA sequence, is composed of the same linear sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose.
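By way of illustration only, the base substitution in the foregoing definition may be sketched in Python for a sequence written in single-letter code. The function name is hypothetical. The change of sugar backbone from deoxyribose to ribose is a chemical difference that a letter string cannot represent, so only the thymine-to-uracil replacement is shown.

```python
def rna_equivalent(dna: str) -> str:
    """Return the same linear sequence with every thymine (T) written as uracil (U)."""
    return dna.upper().replace("T", "U")


print(rna_equivalent("ATGCTT"))  # AUGCUU
```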

[0111] The term “sample” is used in its broadest sense. A sample suspected of containing MDDT, nucleic acids encoding MDDT, or fragments thereof may comprise a bodily fluid; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA, in solution or bound to a substrate; a tissue; a tissue print; etc.

[0112] The terms “specific binding” and “specifically binding” refer to that interaction between a protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or synthetic binding composition. The interaction is dependent upon the presence of a particular structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For example, if an antibody is specific for epitope “A,” the presence of a polypeptide comprising the epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody will reduce the amount of labeled A that binds to the antibody.

[0113] The term “substantially purified” refers to nucleic acid or amino acid sequences that are removed from their natural environment and are isolated or separated, and are at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other components with which they are naturally associated.

[0114] A “substitution” refers to the replacement of one or more amino acid residues or nucleotides by different amino acid residues or nucleotides, respectively.

[0115] “Substrate” refers to any suitable rigid or semi-rigid support including membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles and capillaries. The substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

[0116] A “transcript image” or “expression profile” refers to the collective pattern of gene expression by a particular cell type or tissue under given conditions at a given time.

[0117] “Transformation” describes a process by which exogenous DNA is introduced into a recipient cell. Transformation may occur under natural or artificial conditions according to various methods well known in the art, and may rely on any known method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based on the type of host cell being transformed and may include, but is not limited to, bacteriophage or viral infection, electroporation, heat shock, lipofection, and particle bombardment. The term “transformed cells” includes stably transformed cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome, as well as transiently transformed cells which express the inserted DNA or RNA for limited periods of time.

[0118] A “transgenic organism,” as used herein, is any organism, including but not limited to animals and plants, in which one or more of the cells of the organism contains heterologous nucleic acid introduced by way of human intervention, such as by transgenic techniques well known in the art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The transgenic organisms contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, plants and animals. The isolated DNA of the present invention can be introduced into the host by methods known in the art, for example infection, transfection, transformation or transconjugation. Techniques for transferring the DNA of the present invention into such organisms are widely known and provided in references such as Sambrook et al. (1989), supra.

[0119] A “variant” of a particular nucleic acid sequence is defined as a nucleic acid sequence having at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of the nucleic acid sequences using blastn with the “BLAST 2 Sequences” tool Version 2.0.9 (May 7, 1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length. A variant may be described as, for example, an “allelic” (as defined above), “splice,” “species,” or “polymorphic” variant. A splice variant may have significant identity to a reference molecule, but will generally have a greater or lesser number of polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding polypeptide may possess additional functional domains or lack domains that are present in the reference molecule. Species variants are polynucleotide sequences that vary from one species to another. The resulting polypeptides will generally have significant amino acid identity relative to each other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants also may encompass “single nucleotide polymorphisms” (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The presence of SNPs may be indicative of, for example, a certain population, a disease state, or a propensity for a disease state.

[0120] A “variant” of a particular polypeptide sequence is defined as a polypeptide sequence having at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the polypeptide sequences using blastp with the “BLAST 2 Sequences” tool Version 2.0.9 (May 7, 1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length of one of the polypeptides.
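The percent-identity thresholds in the two “variant” definitions can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as “-”) and simply counts identical columns; it is not the blastn or blastp procedure of the “BLAST 2 Sequences” tool, and the function names and the treatment of gap columns are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned strings of equal length, with '-' for gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when the residues agree and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def is_variant(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True when the pair meets the minimum identity threshold (40% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Raising `threshold` to 90.0 or 95.0 would correspond to the narrower identity ranges listed in the definitions.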

THE INVENTION

[0122] The invention is based on the discovery of new human molecules for disease detection and treatment (MDDT), the polynucleotides encoding MDDT, and the use of these compositions for the diagnosis, treatment, or prevention of cell proliferative, autoimmune/inflammatory, developmental, and neurological disorders.

[0123] Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide sequences of the invention. Each polynucleotide and its corresponding polypeptide are correlated to a single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is denoted by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an Incyte polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide sequence is denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown.

[0124] Table 2 shows sequences with homology to the polypeptides of the invention as identified by BLAST analysis against the GenBank protein (genpept) database. Columns 1 and 2 show the polypeptide sequence identification number (Polypeptide SEQ ID NO:) and the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of the invention. Column 3 shows the GenBank identification number (GenBank ID NO:) of the nearest GenBank homolog. Column 4 shows the probability scores for the matches between each polypeptide and its homolog(s). Column 5 shows the annotation of the GenBank homolog(s) along with relevant citations where applicable, all of which are expressly incorporated by reference herein.

[0125] Table 3 shows various structural features of the polypeptides of the invention. Columns 1 and 2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. Column 3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the MOTIFS program of the GCG sequence analysis software package (Genetics Computer Group, Madison Wis.). Column 6 shows amino acid residues comprising signature sequences, domains, and motifs. Column 7 shows analytical methods for protein structure/function analysis and in some cases, searchable databases to which the analytical methods were applied.

[0126] Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these properties establish that the claimed polypeptides are molecules for disease detection and treatment. For example, SEQ ID NO:11 is 47% identical to Escherichia coli putative nucleotide-binding protein (GenBank ID g1789285) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 1.0e-63, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. Data from MOTIFS and BLAST analyses provide further corroborative evidence that SEQ ID NO:11 is a nucleotide-binding protein. In a further example, SEQ ID NO:15 is 57% identical to bovine xenobiotic/medium-chain fatty acid:CoA ligase form XL-III (GenBank ID g5070357) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 6.7e-181, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:15 also contains an AMP-binding enzyme domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and BLAST analyses provide further corroborative evidence that SEQ ID NO:15 is an acyl CoA ligase with an AMP binding domain. In yet a further example, SEQ ID NO:23 is 40% identical from residues 1 to 268 to human NM23-H8 (GenBank ID g7580490) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The nm23 genes were discovered on the basis of their reduced expression in highly metastatic cell lines. Nm23 proteins from various species display nucleoside diphosphate (NDP) kinase activity and are involved in tissue development and differentiation (Freije, J. M. et al. (1998) Biochem. Soc. Symp. 63:261-271). The BLAST probability score is 1.8e-45, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:23 also contains a thioredoxin family active site and a nucleoside diphosphate kinases active site as determined by PROFILESCAN analysis. (See Table 3.) Data from BLIMPS and MOTIFS analyses provide further corroborative evidence that SEQ ID NO:23 is an nm23 family NDP kinase. SEQ ID NO:1-10, SEQ ID NO:12-14, SEQ ID NO:16-22, and SEQ ID NO:24-26 were analyzed and annotated in a similar manner. The algorithms and parameters for the analysis of SEQ ID NO:1-26 are described in Table 7.

[0127] As shown in Table 4, the full length polynucleotide sequences of the present invention were assembled using cDNA sequences or coding (exon) sequences derived from genomic DNA, or any combination of these two types of sequences. Column 1 lists the polynucleotide sequence identification number (Polynucleotide SEQ ID NO:), the corresponding Incyte polynucleotide consensus sequence number (Incyte ID) for each polynucleotide of the invention, and the length of each polynucleotide sequence in basepairs. Column 2 shows the nucleotide start (5′) and stop (3′) positions of the cDNA and/or genomic sequences used to assemble the full length polynucleotide sequences of the invention, and of fragments of the polynucleotide sequences which are useful, for example, in hybridization or amplification technologies that identify SEQ ID NO:27-52 or that distinguish between SEQ ID NO:27-52 and related polynucleotide sequences.

[0128] The polynucleotide fragments described in Column 2 of Table 4 may refer specifically, for example, to Incyte cDNAs derived from tissue-specific cDNA libraries or from pooled cDNA libraries. Alternatively, the polynucleotide fragments described in column 2 may refer to GenBank cDNAs or ESTs which contributed to the assembly of the full length polynucleotide sequences. In addition, the polynucleotide fragments described in column 2 may identify sequences derived from the ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the designation “ENST”). Alternatively, the polynucleotide fragments described in column 2 may be derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences including the designation “NM” or “NT”) or the NCBI RefSeq Protein Sequence Records (i.e., those sequences including the designation “NP”). Alternatively, the polynucleotide fragments described in column 2 may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an “exon stitching” algorithm. For example, a polynucleotide sequence identified as FL_XXXXXX_N₁_N₂_YYYYY_N₃_N₄ represents a “stitched” sequence in which XXXXXX is the identification number of the cluster of sequences to which the algorithm was applied, and YYYYY is the number of the prediction generated by the algorithm, and N₁, N₂, N₃, and so on, if present, represent specific exons that may have been manually edited during analysis (See Example V). Alternatively, the polynucleotide fragments in column 2 may refer to assemblages of exons brought together by an “exon-stretching” algorithm. For example, a polynucleotide sequence identified as FLXXXXXX_gAAAAA_gBBBBB_1_N is a “stretched” sequence, with XXXXXX being the Incyte project identification number, gAAAAA being the GenBank identification number of the human genomic sequence to which the “exon-stretching” algorithm was applied, gBBBBB being the GenBank identification number or NCBI RefSeq identification number of the nearest GenBank protein homolog, and N referring to specific exons (See Example V). In instances where a RefSeq sequence was used as a protein homolog for the “exon-stretching” algorithm, a RefSeq identifier (denoted by “NM,” “NP,” or “NT”) may be used in place of the GenBank identifier (i.e., gBBBBB).

[0129] Alternatively, a prefix identifies component sequences that were hand-edited, predicted from genomic DNA sequences, or derived from a combination of sequence analysis methods. The following Table lists examples of component sequence prefixes and corresponding sequence analysis methods associated with the prefixes (see Example IV and Example V).

Prefix            Type of analysis and/or examples of programs
GNN, GFG, ENST    Exon prediction from genomic sequences using, for example, GENSCAN (Stanford University, CA, USA) or FGENES (Computer Genomics Group, The Sanger Centre, Cambridge, UK)
GBI               Hand-edited analysis of genomic sequences.
FL                Stitched or stretched genomic sequences (see Example V).
INCY              Full length transcript and exon prediction from mapping of EST sequences to the genome. Genomic location and EST composition data are combined to predict the exons and resulting transcript.

[0130] In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in Table 4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte cDNA identification numbers are not shown.

[0131] Table 5 shows the representative cDNA libraries for those full length polynucleotide sequences which were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte cDNA library which is most frequently represented by the Incyte cDNA sequences which were used to assemble and confirm the above polynucleotide sequences. The tissues and vectors which were used to construct the cDNA libraries shown in Table 5 are described in Table 6.

[0132] The invention also encompasses MDDT variants. A preferred MDDT variant is one which has at least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid sequence identity to the MDDT amino acid sequence, and which contains at least one functional or structural characteristic of MDDT.

[0133] The invention also encompasses polynucleotides which encode MDDT. In a particular embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected from the group consisting of SEQ ID NO:27-52, which encodes MDDT. The polynucleotide sequences of SEQ ID NO:27-52, as presented in the Sequence Listing, embrace the equivalent RNA sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose.
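At the level of sequence text, the DNA-to-RNA equivalence described above amounts to writing uracil wherever thymine appears. The following minimal sketch shows that substitution; the function name is an illustrative assumption, and the change of sugar backbone has no representation in a plain letter string.

```python
# Map thymine to uracil in either letter case; all other symbols pass through.
_T_TO_U = str.maketrans("Tt", "Uu")


def equivalent_rna(dna: str) -> str:
    """Return the RNA sequence text equivalent to a DNA sequence text."""
    return dna.translate(_T_TO_U)
```

For instance, `equivalent_rna("ATGTTT")` yields the text `AUGUUU`.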

[0134] The invention also encompasses a variant of a polynucleotide sequence encoding MDDT. In particular, such a variant polynucleotide sequence will have at least about 70%, or alternatively at least about 85%, or even at least about 95% polynucleotide sequence identity to the polynucleotide sequence encoding MDDT. A particular aspect of the invention encompasses a variant of a polynucleotide sequence comprising a sequence selected from the group consisting of SEQ ID NO:27-52 which has at least about 70%, or alternatively at least about 85%, or even at least about 95% polynucleotide sequence identity to a nucleic acid sequence selected from the group consisting of SEQ ID NO:27-52. Any one of the polynucleotide variants described above can encode an amino acid sequence which contains at least one functional or structural characteristic of MDDT.

[0135] In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant of a polynucleotide sequence encoding MDDT. A splice variant may have portions which have significant sequence identity to the polynucleotide sequence encoding MDDT, but will generally have a greater or lesser number of polynucleotides due to additions or deletions of blocks of sequence arising from alternate splicing of exons during mRNA processing. A splice variant may have less than about 70%, or alternatively less than about 60%, or alternatively less than about 50% polynucleotide sequence identity to the polynucleotide sequence encoding MDDT over its entire length; however, portions of the splice variant will have at least about 70%, or alternatively at least about 85%, or alternatively at least about 95%, or alternatively 100% polynucleotide sequence identity to portions of the polynucleotide sequence encoding MDDT. For example, a polynucleotide comprising a sequence of SEQ ID NO:52 is a splice variant of a polynucleotide comprising a sequence of SEQ ID NO:35. Any one of the splice variants described above can encode an amino acid sequence which contains at least one functional or structural characteristic of MDDT.

[0136] It will be appreciated by those skilled in the art that as a result of the degeneracy of the genetic code, a multitude of polynucleotide sequences encoding MDDT, some bearing minimal similarity to the polynucleotide sequences of any known and naturally occurring gene, may be produced. Thus, the invention contemplates each and every possible variation of polynucleotide sequence that could be made by selecting combinations based on possible codon choices. These combinations are made in accordance with the standard triplet genetic code as applied to the polynucleotide sequence of naturally occurring MDDT, and all such variations are to be considered as being specificallydisclosed.
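The size of the “multitude” that results from codon degeneracy can be made concrete with a short sketch. It builds the standard triplet genetic code and multiplies, residue by residue, the number of codons available for each amino acid. The function name is an illustrative assumption, and the stop codon that would terminate a real coding sequence is deliberately not counted.

```python
from collections import Counter

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Number of synonymous codons for each one-letter residue ('*' marks stop).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    total = 1
    for residue in peptide.upper():
        if residue == "*" or residue not in CODONS_PER_RESIDUE:
            raise ValueError(f"not a standard amino acid: {residue!r}")
        total *= CODONS_PER_RESIDUE[residue]
    return total
```

Because methionine and tryptophan each have a single codon while leucine, serine, and arginine each have six, the count grows multiplicatively with length: the tripeptide Met-Leu-Ser alone has 1 × 6 × 6 = 36 possible coding sequences.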

[0137] Although nucleotide sequences which encode MDDT and its variants are generally capable of hybridizing to the nucleotide sequence of the naturally occurring MDDT under appropriately selected conditions of stringency, it may be advantageous to produce nucleotide sequences encoding MDDT or its derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which particular codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence encoding MDDT and its derivatives without altering the encoded amino acid sequences include the production of RNA transcripts having more desirable properties, such as a greater half-life, than transcripts produced from the naturally occurring sequence.
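Selecting codons according to host usage without altering the encoded amino acid sequence can be sketched as follows. Each codon is replaced by the synonymous codon with the highest frequency in a host usage table, and the original codon is kept on ties or when the table has no entry. The function names are illustrative assumptions, and any usage table passed in is supplied by the caller; no real host frequencies are given here.

```python
# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip([a + b + c for a in BASES for b in BASES for c in BASES], AMINO_ACIDS)
)


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode_for_host(dna: str, usage: dict) -> str:
    """Swap each codon for the synonymous codon most frequent in `usage`.

    `usage` maps codon -> relative frequency in the host (caller-supplied).
    """
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    recoded = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        residue = CODON_TABLE[codon]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == residue]
        # Highest frequency wins; on a tie the original codon is preferred.
        best = max(synonyms, key=lambda c: (usage.get(c, 0.0), c == codon))
        recoded.append(best)
    return "".join(recoded)
```

A quick consistency check is that `translate(recode_for_host(seq, usage)) == translate(seq)` for any in-frame `seq`, since only synonymous substitutions are made.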

[0138] The invention also encompasses production of DNA sequences which encode MDDT and MDDT derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic sequence may be inserted into any of the many available expression vectors and cell systems using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce mutations into a sequence encoding MDDT or any fragment thereof.

[0139] Also encompassed by the invention are polynucleotide sequences that are capable of hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in SEQ ID NO:27-52 and fragments thereof under various conditions of stringency. (See, e.g., Wahl, G. M. and S. L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A. R. (1987) Methods Enzymol. 152:507-511.) Hybridization conditions, including annealing and wash conditions, are described in “Definitions.”

[0140] Methods for DNA sequencing are well known in the art and may be used to practice any of the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland Ohio), Taq polymerase (Applied Biosystems), thermostable T7 polymerase (Amersham Pharmacia Biotech, Piscataway N.J.), or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Life Technologies, Gaithersburg Md.). Preferably, sequence preparation is automated with machines such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno Nev.), PTC200 thermal cycler (MJ Research, Watertown Mass.) and ABI CATALYST 800 thermal cycler (Applied Biosystems). Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system (Applied Biosystems), the MEGABACE 1000 DNA sequencing system (Molecular Dynamics, Sunnyvale Calif.), or other systems known in the art. The resulting sequences are analyzed using a variety of algorithms which are well known in the art. (See, e.g., Ausubel, F. M. (1997) Short Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., unit 7.7; Meyers, R. A. (1995) Molecular Biology and Biotechnology, Wiley VCH, New York N.Y., pp. 856-853.)

[0141] The nucleic acid sequences encoding MDDT may be extended utilizing a partial nucleotide sequence and employing various PCR-based methods known in the art to detect upstream sequences, such as promoters and regulatory elements. For example, one method which may be employed,

ligations may be used to insert an engineered double-stranded sequence into a region of unknown sequence before performing PCR. Other methods which may be used to retrieve unknown sequences are known in the art. (See, e.g., Parker, J. D. et al. (1991) Nucleic Acids Res. 19:3055-3060). Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo Alto Calif.) to walk genomic DNA. This procedure avoids the need to screen libraries and is useful in finding intron/exon junctions. For all PCR-based methods, primers may be designed using commercially available software, such as OLIGO 4.06 primer analysis software (National Biosciences, Plymouth Minn.) or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the template at temperatures of about 68° C. to 72° C.
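The length and GC-content criteria for primers stated above can be checked with a minimal sketch such as the following. It is not the OLIGO 4.06 software; the function names are illustrative assumptions, and the annealing-temperature criterion is intentionally not modeled because it depends on a melting-temperature method that the text does not specify.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a non-empty primer sequence."""
    if not primer:
        raise ValueError("primer must not be empty")
    sequence = primer.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def meets_primer_criteria(
    primer: str, min_len: int = 22, max_len: int = 30, min_gc: float = 0.50
) -> bool:
    """True when the primer is 22 to 30 nt long with GC content of 50% or more."""
    # The length test runs first, so an empty primer is rejected before
    # gc_fraction is ever called.
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc
```

Candidate primers that pass this filter would still need to be evaluated for annealing temperature with a dedicated primer analysis program.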

[0142] When screening for full length cDNAs, it is preferable to use libraries that have been size-selected to include larger cDNAs. In addition, random-primed libraries, which often include sequences containing the 5′ regions of genes, are preferable for situations in which an oligo d(T) library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence into 5′ non-transcribed regulatory regions.

[0143] Capillary electrophoresis systems which are commercially available may be used to analyze the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary sequencing may employ flowable polymers for electrophoretic separation, four different nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire process from loading of samples to computer analysis and electronic data display may be computer controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments which may be present in limited amounts in a particular sample.

[0144] In another embodiment of the invention, polynucleotide sequences or fragments thereof which encode MDDT may be cloned in recombinant DNA molecules that direct expression of MDDT, or fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences which encode substantially the same or a functionally equivalent amino acid sequence may be produced and used to express MDDT.

[0145] The nucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter MDDT-encoding sequences for a variety of purposes including, but not limited to, modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, and so forth.

[0146] The nucleotides of the present invention may be subjected to DNA shuffling techniques such as MOLECULARBREEDING (Maxygen Inc., Santa Clara Calif.; described in U.S. Pat. No. 5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F. C. et al. (1999) Nat. Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve the biological properties of MDDT, such as its biological or enzymatic activity or its ability to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene variants is produced using PCR-mediated recombination of gene fragments. The library is then subjected to selection or screening procedures that identify those gene variants with the desired properties. These preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and selection/screening. Thus, genetic diversity is created through “artificial” breeding and rapid molecular evolution. For example, fragments of a single gene containing random point mutations may be recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively, fragments of a given gene may be recombined with fragments of homologous genes in the same gene family, either from the same or different species, thereby maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable manner.

[0147] In another embodiment, sequences encoding MDDT may be synthesized, in whole or in part, using chemical methods well known in the art. (See, e.g., Caruthers, M. H. et al. (1980) Nucleic Acids Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232.) Alternatively, MDDT itself or a fragment thereof may be synthesized using chemical methods. For example, peptide synthesis can be performed using various solution-phase or solid-phase techniques. (See, e.g., Creighton, T. (1984) Proteins, Structures and Molecular Properties, W H Freeman, New York N.Y., pp. 55-60; and Roberge, J. Y. et al. (1995) Science 269:202-204.) Automated synthesis may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the amino acid sequence of MDDT, or any part thereof, may be altered during direct synthesis and/or combined with sequences from other proteins, or any part thereof, to produce a variant polypeptide or a polypeptide having a sequence of a naturally occurring polypeptide.

[0148] The peptide may be substantially purified by preparative high performance liquid chromatography. (See, e.g., Chiez, R. M. and F. Z. Regnier (1990) Methods Enzymol. 182:392-421.) The composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing. (See, e.g., Creighton, supra, pp. 28-53.)

[0149] In order to express a biologically active MDDT, the nucleotide sequences encoding MDDT or derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for transcriptional and translational control of the inserted coding sequence in a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and inducible promoters, and 5′ and 3′ untranslated regions in the vector and in polynucleotide sequences encoding MDDT. Such elements may vary in their strength and specificity. Specific initiation signals may also be used to achieve more efficient translation of sequences encoding MDDT. Such signals include the ATG initiation codon and adjacent sequences, e.g., the Kozak sequence. In cases where sequences encoding MDDT and its initiation codon and upstream regulatory sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous translational control signals including an in-frame ATG initiation codon should be provided by the vector. Exogenous translational elements and initiation codons may be of various origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers appropriate for the particular host cell system used. (See, e.g., Scharf, D. et al. (1994) Results Probl. Cell Differ. 20:125-162.)

[0150] Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding MDDT and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y., ch. 4, 8, and 16-17; Ausubel, F. M. et al. (1995) Current Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., ch. 9, 13, and 16.)

[0151] A variety of expression vector/host systems may be utilized to contain and express sequences encoding MDDT. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems. (See, e.g., Sambrook, supra; Ausubel, supra; Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Engelhard, E. K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York N.Y., pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA 90(13):6340-6344; Buller, R. M. et al. (1985) Nature 317(6040):813-815; McGregor, D. P. et al. (1994) Mol. Immunol. 31(3):219-226; and Verma, I. M. and N. Somia (1997) Nature 389:239-242.) The invention is not limited by the host cell employed.

[0152] In bacterial systems, a number of cloning and expression vectors may be selected depending upon the use intended for polynucleotide sequences encoding MDDT. For example, routine cloning, subcloning, and propagation of polynucleotide sequences encoding MDDT can be achieved using a multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla Calif.) or PSPORT1 plasmid (Life Technologies). Ligation of sequences encoding MDDT into the vector's multiple cloning site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of nested deletions in the cloned sequence. (See, e.g., Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509.) When large quantities of MDDT are needed, e.g., for the production of antibodies, vectors which direct high level expression of MDDT may be used. For example, vectors containing the strong, inducible SP6 or T7 bacteriophage promoter may be used.

[0153] Yeast expression systems may be used for production of MDDT. A number of vectors containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such vectors direct either the secretion or intracellular retention of expressed proteins and enable integration of foreign sequences into the host genome for stable propagation. (See, e.g., Ausubel, 1995, supra; Bitter, G. A. et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C. A. et al. (1994) Bio/Technology 12:181-184.)

[0154] Plant systems may also be used for expression of MDDT. Transcription of sequences encoding MDDT may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J. 6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters may be used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science 224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.) These constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated transfection. (See, e.g., The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York N.Y., pp. 191-196.)

[0155] In mammalian cells, a number of viral-based expression systems may be utilized. In cases where an adenovirus is used as an expression vector, sequences encoding MDDT may be ligated into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain infective virus which expresses MDDT in host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659.) In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based vectors may also be used for high-level protein expression.

[0156] Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers, or vesicles) for therapeutic purposes. (See, e.g., Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355.)

[0157] For long term production of recombinant proteins in mammalian systems, stable expression of MDDT in cell lines is preferred. For example, sequences encoding MDDT can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media before being switched to selective media. The purpose of the selectable marker is to confer resistance to a selective agent, and its presence allows growth and recovery of cells which successfully express the introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue culture techniques appropriate to the cell type.

[0158] Any number of selection systems may be used to recover transformed cell lines. These include, but are not limited to, the herpes simplex virus thymidine kinase and adenine phosphoribosyltransferase genes, for use in tk⁻ and apr⁻ cells, respectively. (See, e.g., Wigler, M. et al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823.) Also, antimetabolite, antibiotic, or herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat confer resistance to chlorsulfuron and phosphinothricin, respectively. (See, e.g., Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14.) Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular requirements for metabolites. (See, e.g., Hartman, S. C. and R. C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051.) Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; Clontech), β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used. These markers can be used not only to identify transformants, but also to quantify the amount of transient or stable protein expression attributable to a specific vector system. (See, e.g., Rhodes, C. A. (1995) Methods Mol. Biol. 55:121-131.)

[0159] Although the presence/absence of marker gene expression suggests that the gene of interest is also present, the presence and expression of the gene may need to be confirmed. For example, if the sequence encoding MDDT is inserted within a marker gene sequence, transformed cells containing sequences encoding MDDT can be identified by the absence of marker gene function. Alternatively, a marker gene can be placed in tandem with a sequence encoding MDDT under the control of a single promoter. Expression of the marker gene in response to induction or selection usually indicates expression of the tandem gene as well.

[0160] In general, host cells that contain the nucleic acid sequence encoding MDDT and that express MDDT may be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or chip based technologies for the detection and/or quantification of nucleic acid or protein sequences.

[0161] Immunological methods for detecting and measuring the expression of MDDT using either specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes on MDDT is preferred, but a competitive binding assay may be employed. These and other assays are well known in the art. (See, e.g., Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul Minn., Sect. IV; Coligan, J. E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and Wiley-Interscience, New York N.Y.; and Pound, J. D. (1998) Immunochemical Protocols, Humana Press, Totowa N.J.)

[0162] A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization or PCR probes for detecting sequences related to polynucleotides encoding MDDT include oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, the sequences encoding MDDT, or any fragments thereof, may be cloned into a vector for the production of an mRNA probe. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety of commercially available kits, such as those provided by Amersham Pharmacia Biotech, Promega (Madison Wis.), and US Biochemical. Suitable reporter molecules or labels which may be used for ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

[0163] Host cells transformed with nucleotide sequences encoding MDDT may be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The protein produced by a transformed cell may be secreted or retained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode MDDT may be designed to contain signal sequences which direct secretion of MDDT through a prokaryotic or eukaryotic cell membrane.

[0164] In addition, a host cell strain may be chosen for its ability to modulate expression of the inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a “prepro” or “pro” form of the protein may also be used to specify protein targeting, folding, and/or activity. Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture Collection (ATCC, Manassas Va.) and may be chosen to ensure the correct modification and processing of the foreign protein.

[0165] In another embodiment of the invention, natural, modified, or recombinant nucleic acid sequences encoding MDDT may be ligated to a heterologous sequence resulting in translation of a fusion protein in any of the aforementioned host systems. For example, a chimeric MDDT protein containing a heterologous moiety that can be recognized by a commercially available antibody may facilitate the screening of peptide libraries for inhibitors of MDDT activity. Heterologous protein and peptide moieties may also facilitate purification of fusion proteins using commercially available affinity matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site located between the MDDT encoding sequence and the heterologous protein sequence, so that MDDT may be cleaved away from the heterologous moiety following purification. Methods for fusion protein expression and purification are discussed in Ausubel (1995, supra, ch. 10). A variety of commercially available kits may also be used to facilitate expression and purification of fusion proteins.

[0166] In a further embodiment of the invention, synthesis of radiolabeled MDDT may be achieved in vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple transcription and translation of protein-coding sequences operably associated with the T7, T3, or SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for example, ³⁵S-methionine.

[0167] MDDT of the present invention or fragments thereof may be used to screen for compounds that specifically bind to MDDT. At least one and up to a plurality of test compounds may be screened for specific binding to MDDT. Examples of test compounds include antibodies, oligonucleotides, proteins (e.g., receptors), or small molecules.

[0168] In one embodiment, the compound thus identified is closely related to the natural ligand of MDDT, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a natural binding partner. (See, e.g., Coligan, J. E. et al. (1991) Current Protocols in Immunology 1(2): Chapter 5.) Similarly, the compound can be closely related to the natural receptor to which MDDT binds, or to at least a fragment of the receptor, e.g., the ligand binding site. In either case, the compound can be rationally designed using known techniques. In one embodiment, screening for these compounds involves producing appropriate cells which express MDDT, either as a secreted protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells expressing MDDT or cell membrane fractions which contain MDDT are then contacted with a test compound and binding, stimulation, or inhibition of activity of either MDDT or the compound is analyzed.

[0169] An assay may simply test binding of a test compound to the polypeptide, wherein binding is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example, the assay may comprise the steps of combining at least one test compound with MDDT, either in solution or affixed to a solid support, and detecting the binding of MDDT to the compound. Alternatively, the assay may detect or measure binding of a test compound in the presence of a labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a solid support.

[0170] MDDT of the present invention or fragments thereof may be used to screen for compounds that modulate the activity of MDDT. Such compounds may include agonists, antagonists, or partial or inverse agonists. In one embodiment, an assay is performed under conditions permissive for MDDT activity, wherein MDDT is combined with at least one test compound, and the activity of MDDT in the presence of a test compound is compared with the activity of MDDT in the absence of the test compound. A change in the activity of MDDT in the presence of the test compound is indicative of a compound that modulates the activity of MDDT. Alternatively, a test compound is combined with an in vitro or cell-free system comprising MDDT under conditions suitable for MDDT activity, and the assay is performed. In either of these assays, a test compound which modulates the activity of MDDT may do so indirectly and need not come in direct contact with MDDT. At least one and up to a plurality of test compounds may be screened.
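
The comparison described above, activity in the presence of a test compound versus activity in its absence, can be sketched as follows. This is an illustrative sketch only: the function name, the activity units, and the 20% cutoff are assumptions introduced here and are not part of the disclosed assay.

```python
def classify_modulation(activity_with, activity_without, threshold=0.2):
    """Compare activity measured with and without a test compound.

    `threshold` is a hypothetical fractional cutoff (20% by default); in
    practice it would be chosen from the variability of the assay itself.
    """
    if activity_without <= 0:
        raise ValueError("control activity must be positive")
    # Ratio > 1 means more activity with the compound, < 1 means less.
    ratio = activity_with / activity_without
    if ratio >= 1 + threshold:
        return "increases activity"
    if ratio <= 1 - threshold:
        return "decreases activity"
    return "no detectable change"
```

A call such as classify_modulation(50.0, 100.0) reports a decrease, which under these assumptions would flag the compound as a candidate inhibitor or antagonist for follow-up.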

[0171] In another embodiment, polynucleotides encoding MDDT or their mammalian homologs may be “knocked out” in an animal model system using homologous recombination in embryonic stem (ES) cells. Such techniques are well known in the art and are useful for the generation of animal models of human disease. (See, e.g., U.S. Pat. No. 5,175,383 and U.S. Pat. No. 5,767,337.) For example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M. R. (1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host genome by homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP system to knock out a gene of interest in a tissue- or developmental stage-specific manner (Marth, J. D. (1996) J. Clin. Invest. 97:1999-2002; Wagner, K. U. et al. (1997) Nucleic Acids Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

[0172] Polynucleotides encoding MDDT may also be manipulated in vitro in ES cells derived from human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J. A. et al. (1998) Science 282:1145-1147).

[0173] Polynucleotides encoding MDDT can also be used to create “knockin” humanized animals (pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of a polynucleotide encoding MDDT is injected into animal ES cells, and the injected sequence integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a mammal inbred to overexpress MDDT, e.g., by secreting MDDT in its milk, may also serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

[0174] THERAPEUTICS

[0175] Chemical and structural similarity, e.g., in the context of sequences and motifs, exists between regions of MDDT and molecules for disease detection and treatment. In addition, examples of tissues expressing MDDT can be found in Table 6. Therefore, MDDT appears to play a role in cell proliferative, autoimmune/inflammatory, developmental, and neurological disorders. In the treatment of disorders associated with increased MDDT expression or activity, it is desirable to decrease the expression or activity of MDDT. In the treatment of disorders associated with decreased MDDT expression or activity, it is desirable to increase the expression or activity of MDDT.

[0176] Therefore, in one embodiment, MDDT or a fragment or derivative thereof may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of MDDT. Examples of such disorders include, but are not limited to, a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; an autoimmune/inflammatory disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a developmental disorder such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss; and a neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia.

[0177] In another embodiment, a vector capable of expressing MDDT or a fragment or derivative thereof may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of MDDT including, but not limited to, those described above.

[0178] In a further embodiment, a composition comprising a substantially purified MDDT in conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of MDDT including, but not limited to, those provided above.

[0179] In still another embodiment, an agonist which modulates the activity of MDDT may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of MDDT including, but not limited to, those listed above.

[0180] In a further embodiment, an antagonist of MDDT may be administered to a subject to treat or prevent a disorder associated with increased expression or activity of MDDT. Examples of such disorders include, but are not limited to, those cell proliferative, autoimmune/inflammatory, developmental, and neurological disorders described above. In one aspect, an antibody which specifically binds MDDT may be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissues which express MDDT.

[0181] In an additional embodiment, a vector expressing the complement of the polynucleotide encoding MDDT may be administered to a subject to treat or prevent a disorder associated with increased expression or activity of MDDT including, but not limited to, those described above.

[0182] In other embodiments, any of the proteins, antagonists, antibodies, agonists, complementary sequences, or vectors of the invention may be administered in combination with other appropriate therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made by one of ordinary skill in the art, according to conventional pharmaceutical principles. The combination of therapeutic agents may act synergistically to effect the treatment or prevention of the various disorders described above. Using this approach, one may be able to achieve therapeutic efficacy with lower dosages of each agent, thus reducing the potential for adverse side effects.

[0183] An antagonist of MDDT may be produced using methods which are generally known in the art. In particular, purified MDDT may be used to produce antibodies or to screen libraries of pharmaceutical agents to identify those which specifically bind MDDT. Antibodies to MDDT may also be generated using methods that are well known in the art. Such antibodies may include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit dimer formation) are generally preferred for therapeutic use.

[0184] For the production of antibodies, various hosts including goats, rabbits, rats, mice, humans, and others may be immunized by injection with MDDT or with any fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species, various adjuvants may be used to increase immunological response. Such adjuvants include, but are not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are especially preferable.

[0185] It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to MDDT have an amino acid sequence consisting of at least about 5 amino acids, and generally will consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or fragments are identical to a portion of the amino acid sequence of the natural protein. Short stretches of MDDT amino acids may be fused with those of another protein, such as KLH, and antibodies to the chimeric molecule may be produced.

[0186] Monoclonal antibodies to MDDT may be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique. (See, e.g., Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. Immunol. Methods 81:31-42; Cote, R. J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; and Cole, S. P. et al. (1984) Mol. Cell Biol. 62:109-120.)

[0187] In addition, techniques developed for the production of “chimeric antibodies,” such as the splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can be used. (See, e.g., Morrison, S. L. et al. (1984) Proc. Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M. S. et al. (1984) Nature 312:604-608; and Takeda, S. et al. (1985) Nature 314:452-454.) Alternatively, techniques described for the production of single chain antibodies may be adapted, using methods known in the art, to produce MDDT-specific single chain antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be generated by chain shuffling from random combinatorial immunoglobulin libraries. (See, e.g., Burton, D. R. (1991) Proc. Natl. Acad. Sci. USA 88:10134-10137.)

[0188] Antibodies may also be produced by inducing in vivo production in the lymphocyte population or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in the literature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299.)

[0189] Antibody fragments which contain specific binding sites for MDDT may also be generated. For example, such fragments include, but are not limited to, F(ab′)₂ fragments produced by pepsin digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the F(ab′)₂ fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity. (See, e.g., Huse, W. D. et al. (1989) Science 246:1275-1281.)

[0190] Various immunoassays may be used for screening to identify antibodies having the desired specificity. Numerous protocols for competitive binding or immunoradiometric assays using either polyclonal or monoclonal antibodies with established specificities are well known in the art. Such immunoassays typically involve the measurement of complex formation between MDDT and its specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering MDDT epitopes is generally used, but a competitive binding assay may also be employed (Pound, supra).

[0191] Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques may be used to assess the affinity of antibodies for MDDT. Affinity is expressed as an association constant, K_a, which is defined as the molar concentration of MDDT-antibody complex divided by the product of the molar concentrations of free antigen and free antibody under equilibrium conditions. The K_a determined for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for multiple MDDT epitopes, represents the average affinity, or avidity, of the antibodies for MDDT. The K_a determined for a preparation of monoclonal antibodies, which are monospecific for a particular MDDT epitope, represents a true measure of affinity. High-affinity antibody preparations with K_a ranging from about 10⁹ to 10¹² L/mole are preferred for use in immunoassays in which the MDDT-antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations with K_a ranging from about 10⁶ to 10⁷ L/mole are preferred for use in immunopurification and similar procedures which ultimately require dissociation of MDDT, preferably in active form, from the antibody (Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington D.C.; Liddell, J. E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York N.Y.).
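
As an illustration of the Scatchard analysis mentioned above, the following sketch estimates the association constant from paired bound and free antigen concentrations. It assumes a single class of homogeneous binding sites (as for a monoclonal preparation), for which bound/free plotted against bound is a straight line of slope equal to the negative of the association constant. The function name and the use of an ordinary least-squares fit are choices made here for illustration, not a procedure taken from the cited references.

```python
def scatchard_ka(bound, free):
    """Estimate (Ka, Bmax) from equilibrium bound and free concentrations.

    Assumes one class of binding sites, so that
        bound / free = Ka * (Bmax - bound)
    Inputs are molar concentrations; Ka is returned in L/mole.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                 # least-squares slope = -Ka
    intercept = mean_y - slope * mean_x
    ka = -slope
    bmax = intercept / ka             # x-intercept = total binding sites
    return ka, bmax
```

For polyclonal preparations the plot is generally curved rather than linear, so a single fitted slope of this kind would only approximate the average affinity described above.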

[0192] The titer and avidity of polyclonal antibody preparations may be further evaluated to determine the quality and suitability of such preparations for certain downstream applications. For example, a polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, preferably 5-10 mg specific antibody/ml, is generally employed in procedures requiring precipitation of MDDT-antibody complexes. Procedures for evaluating antibody specificity, titer, and avidity, and guidelines for antibody quality and usage in various applications, are generally available. (See, e.g., Catty, supra, and Coligan et al. supra.)

[0193] In another embodiment of the invention, the polynucleotides encoding MDDT, or any fragment or complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene expression can be achieved by designing complementary sequences or antisense molecules (DNA, RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding MDDT. Such technology is well known in the art, and antisense oligonucleotides or larger fragments can be designed from various locations along the coding or control regions of sequences encoding MDDT. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa N.J.)
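
The design of a complementary (antisense) sequence to a chosen region can be sketched as a reverse-complement operation. The function name, the window parameters, and the example sequence in the note below are hypothetical; the sequence is not an MDDT sequence, and practical antisense design would also weigh target accessibility, specificity, and chemistry, which this sketch does not address.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_dna(sense, start, length):
    """Return the DNA antisense oligonucleotide (written 5' to 3') that is
    complementary to sense[start:start + length].

    `sense` is the coding-strand sequence written 5' to 3'.
    """
    target = sense[start:start + length].upper()
    if start < 0 or len(target) != length:
        raise ValueError("window falls outside the sequence")
    if set(target) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(_COMPLEMENT)[::-1]
```

For the made-up sense sequence ATGGCCAAGT, antisense_dna("ATGGCCAAGT", 0, 6) yields GGCCAT, the reverse complement of the first six bases.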

[0194] In therapeutic use, any gene delivery system suitable for introduction of the antisense sequences into appropriate target cells can be used. Antisense sequences can be delivered intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g., Slater, J. E. et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K. J. et al. (1995) 9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A. D. (1990) Blood 76:271; Ausubel, supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in the art. (See, e.g., Rossi, J. J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R. J. et al. (1998) J. Pharm. Sci. 87(11):1308-1315; and Morris, M. C. et al. (1997) Nucleic Acids Res. 25(14):2730-2736.)

[0195] In another embodiment of the invention, polynucleotides encoding MDDT may be used for somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency (Blaese, R. M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R. G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R. G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal, R. G. (1995) Science 270:404-410; Verma, I. M. and N. Somia (1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic deficiency in MDDT expression or regulation causes disease, the expression of MDDT from an appropriate population of transduced cells may alleviate the clinical manifestations caused by the genetic deficiency.

[0196] In a further embodiment of the invention, diseases or disorders caused by deficiencies in MDDT are treated by constructing mammalian expression vectors encoding MDDT and introducing these vectors by mechanical means into MDDT-deficient cells. Mechanical transfer technologies for use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the use of DNA transposons (Morgan, R. A. and W. F. Anderson (1993) Annu. Rev. Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Récipon (1998) Curr. Opin. Biotechnol. 9:445-450).

[0197] Expression vectors that may be effective for the expression of MDDT include, but are not limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors (Invitrogen, Carlsbad Calif.), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla Calif.), and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto Calif.). MDDT may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F. M. V. and H. M. Blau (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen)); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F. M. V. and H. M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous gene encoding MDDT from a normal individual.

[0198] Commercially available liposome transformation kits (e.g., the PERFECT LIPID TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver polynucleotides to target cells in culture and require minimal effort to optimize experimental parameters. In the alternative, transformation is performed using the calcium phosphate method (Graham, F. L. and A. J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. (1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these standardized mammalian transfection protocols.

[0199] In another embodiment of the invention, diseases or disorders caused by genetic defects with respect to MDDT expression are treated by constructing a retrovirus vector consisting of (i) the polynucleotide encoding MDDT under the control of an independent promoter or the retrovirus long terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-1650; Bender, M. A. et al. (1987) J. Virol. 61:1639-1646; Adam, M. A. and A. D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et al. (1998) J. Virol. 72:9873-9880). U.S. Pat. No. 5,910,434 to Rigg (“Method for obtaining retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant”) discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4⁺ T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M. L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

[0200] In the alternative, an adenovirus-based gene therapy delivery system is used to deliver polynucleotides encoding MDDT to cells which have one or more genetic abnormalities with respect to the expression of MDDT. The construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in the art. Replication-defective adenovirus vectors have proven to be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M. E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are described in U.S. Pat. No. 5,707,618 to Armentano (“Adenovirus vectors for gene therapy”), hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P. A. et al. (1999) Annu. Rev. Nutr. 19:511-544 and Verma, I. M. and N. Somia (1997) Nature 389:239-242, both incorporated by reference herein.

[0201] In another alternative, a herpes-based gene therapy delivery system is used to deliver polynucleotides encoding MDDT to target cells which have one or more genetic abnormalities with respect to the expression of MDDT. The use of herpes simplex virus (HSV)-based vectors may be especially valuable for introducing MDDT to cells of the central nervous system, for which HSV has a tropism. The construction and packaging of herpes-based vectors are well known to those with ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395). The construction of an HSV-1 virus vector has also been disclosed in detail in U.S. Pat. No. 5,804,413 to DeLuca (“Herpes simplex virus strains for gene transfer”), which is hereby incorporated by reference. U.S. Pat. No. 5,804,413 teaches the use of recombinant HSV d92 which consists of a genome containing at least one exogenous gene to be transferred to a cell under the control of the appropriate promoter for purposes including human gene therapy. Also taught by this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22. For HSV vectors, see also Goins, W. F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) Dev. Biol. 163:152-161, hereby incorporated by reference. The manipulation of cloned herpesvirus sequences, the generation of recombinant virus following the transfection of multiple plasmids containing different segments of the large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of ordinary skill in the art.

[0202] In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to deliver polynucleotides encoding MDDT to target cells. The biology of the prototypic alphavirus, Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA, resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease and polymerase). Similarly, inserting the coding sequence for MDDT into the alphavirus genome in place of the capsid-coding region results in the production of a large number of MDDT-coding RNAs and the synthesis of high levels of MDDT in vector-transduced cells. While alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S. A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the introduction of MDDT into a variety of cell types. The specific transduction of a subset of cells in a population may require the sorting of cells prior to transduction. The methods of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus infections are well known to those with ordinary skill in the art.

[0203] Oligonucleotides derived from the transcription initiation site, e.g., between about positions −10 and +10 from the start site, may also be employed to inhibit gene expression. Similarly, inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have been described in the literature. (See, e.g., Gee, J. E. et al. (1994) in Huber, B. E. and B. I. Carr, Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco N.Y., pp. 163-177.) A complementary sequence or antisense molecule may also be designed to block translation of mRNA by preventing the transcript from binding to ribosomes.

[0204] Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. For example, engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze endonucleolytic cleavage of sequences encoding MDDT.

[0205] Specific ribozyme cleavage sites within any potential RNA target are initially identified by scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA, GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides, corresponding to the region of the target gene containing the cleavage site, may be evaluated for secondary structural features which may render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays.
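
By way of illustration only, the scanning step described above may be sketched in Python as follows. The function name `find_cleavage_windows`, the default window length of 17 ribonucleotides, and the choice to center the window on the triplet are assumptions made for this sketch and are not part of the disclosure; the subsequent evaluation of secondary structure and accessibility is not modeled.

```python
# Triplets named in the text as candidate ribozyme cleavage sites.
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_windows(rna, window=17):
    """Return (site_index, triplet, window_sequence) for each candidate site.

    `window` is the length of the RNA segment surrounding the triplet and
    must lie in the 15-20 ribonucleotide range given in the text.
    Centering the window on the triplet is an assumption of this sketch.
    """
    if not 15 <= window <= 20:
        raise ValueError("window must be between 15 and 20 ribonucleotides")
    # Accept DNA-style input by converting T to U.
    rna = rna.upper().replace("T", "U")
    left = (window - 3) // 2  # bases kept upstream of the triplet
    hits = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet not in CLEAVAGE_TRIPLETS:
            continue
        start = i - left
        end = start + window
        if start < 0 or end > len(rna):
            continue  # site too close to an end for a full-length window
        hits.append((i, triplet, rna[start:end]))
    return hits
```

Each returned window is a candidate target region that would then be examined, by methods outside this sketch, for secondary structural features and for accessibility to hybridization.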

[0206] Complementary ribonucleic acid molecules and ribozymes of the invention may be prepared by any method known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences encoding MDDT. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues.

[0207] RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5′ and/or 3′ ends of the molecule, or the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the backbone of the molecule. This concept is inherent in the production of PNAs and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

[0208] An additional embodiment of the invention encompasses a method for screening for a compound which is effective in altering expression of a polynucleotide encoding MDDT. Compounds which may be effective in altering expression of a specific polynucleotide may include, but are not limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides, transcription factors and other polypeptide transcriptional regulators, and non-macromolecular chemical entities which are capable of interacting with specific polynucleotide sequences. Effective compounds may alter polynucleotide expression by acting as either inhibitors or promoters of polynucleotide expression. Thus, in the treatment of disorders associated with increased MDDT expression or activity, a compound which specifically inhibits expression of the polynucleotide encoding MDDT may be therapeutically useful, and in the treatment of disorders associated with decreased MDDT expression or activity, a compound which specifically promotes expression of the polynucleotide encoding MDDT may be therapeutically useful.

[0209] At least one, and up to a plurality, of test compounds may be screened for effectiveness in altering expression of a specific polynucleotide. A test compound may be obtained by any method commonly known in the art, including chemical modification of a compound known to be effective in altering polynucleotide expression; selection from an existing, commercially-available or proprietary library of naturally-occurring or non-natural chemical compounds; rational design of a compound based on chemical and/or structural properties of the target polynucleotide; and selection from a library of chemical compounds created combinatorially or randomly. A sample comprising a polynucleotide encoding MDDT is exposed to at least one test compound thus obtained. The sample may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted biochemical system. Alterations in the expression of a polynucleotide encoding MDDT are assayed by any method commonly known in the art. Typically, the expression of a specific nucleotide is detected by hybridization with a probe having a nucleotide sequence complementary to the sequence of the polynucleotide encoding MDDT. The amount of hybridization may be quantified, thus forming the basis for a comparison of the expression of the polynucleotide both with and without exposure to one or more test compounds. Detection of a change in the expression of a polynucleotide exposed to a test compound indicates that the test compound is effective in altering the expression of the polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide can be carried out, for example, using a Schizosaccharomyces pombe gene expression system (Atkins, D. et al. (1999) U.S. Pat. No. 5,932,435; Arndt, G. M. et al. (2000) Nucleic Acids Res. 28:E15) or a human cell line such as HeLa cell (Clarke, M. L. et al. (2000) Biochem. Biophys. Res. Commun. 268:8-13). A particular embodiment of the present invention involves screening a combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide sequence (Bruice, T. W. et al. (1997) U.S. Pat. No. 5,686,242; Bruice, T. W. et al. (2000) U.S. Pat. No. 6,022,691).
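
The comparison of quantified hybridization with and without exposure to a test compound may, for example, be expressed as a ratio. The following Python sketch is illustrative only: the function names and the two-fold threshold are assumptions introduced here, and any suitable measure of change may be substituted.

```python
def expression_ratio(treated_signal, untreated_signal):
    """Ratio of hybridization signal with and without the test compound."""
    if untreated_signal <= 0:
        raise ValueError("untreated signal must be positive")
    return treated_signal / untreated_signal


def classify_compound(treated_signal, untreated_signal, threshold=2.0):
    """Label a test compound by its effect on expression.

    `threshold` (an assumed two-fold cutoff) is the minimum fold change
    treated as an alteration of expression in this sketch.
    """
    ratio = expression_ratio(treated_signal, untreated_signal)
    if ratio >= threshold:
        return "promoter"
    if ratio <= 1.0 / threshold:
        return "inhibitor"
    return "no effect"
```

Under these assumptions, a compound that reduces the signal from 200 units to 50 units would be labeled an inhibitor of polynucleotide expression.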

[0210] Many methods for introducing vectors into cells or tissues are available and equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells taken from the patient and clonally propagated for autologous transplant back into that same patient. Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved using methods which are well known in the art. (See, e.g., Goldman, C. K. et al. (1997) Nat. Biotechnol. 15:462-466.)

[0211] Any of the therapeutic methods described above may be applied to any subject in need of such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and monkeys.

[0212] An additional embodiment of the invention relates to the administration of a composition which generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various formulations are commonly known and are thoroughly discussed in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing, Easton Pa.). Such compositions may consist of MDDT, antibodies to MDDT, and mimetics, agonists, antagonists, or inhibitors of MDDT.

[0213] The compositions utilized in this invention may be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.

[0214] Compositions for pulmonary administration may be prepared in liquid or dry powder form. These compositions are generally aerosolized immediately prior to inhalation by the patient. In the case of small molecules (e.g., traditional low molecular weight organic drugs), aerosol delivery of fast-acting formulations is well-known in the art. In the case of macromolecules (e.g., larger peptides and proteins), recent developments in the field of pulmonary delivery via the alveolar region of the lung have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, J. S. et al., U.S. Pat. No. 5,997,848). Pulmonary delivery has the advantage of administration without needle injection, and obviates the need for potentially toxic penetration enhancers.

[0215] Compositions suitable for use in the invention include compositions wherein the active ingredients are contained in an effective amount to achieve the intended purpose. The determination of an effective dose is well within the capability of those skilled in the art.

[0216] Specialized forms of compositions may be prepared for direct intracellular delivery of macromolecules comprising MDDT or fragments thereof. For example, liposome preparations containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the macromolecule. Alternatively, MDDT or a fragment thereof may be joined to a short cationic N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S. R. et al. (1999) Science 285:1569-1572).

[0217] For any compound, the therapeutically effective dose can be estimated initially either in cell culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, monkeys, or pigs. An animal model may also be used to determine the appropriate concentration range and route of administration. Such information can then be used to determine useful doses and routes for administration in humans.

[0218] A therapeutically effective dose refers to that amount of active ingredient, for example MDDT or fragments thereof, antibodies to MDDT, and agonists, antagonists or inhibitors of MDDT, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by calculating the ED₅₀ (the dose therapeutically effective in 50% of the population) or LD₅₀ (the dose lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the therapeutic index, which can be expressed as the LD₅₀/ED₅₀ ratio. Compositions which exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are used to formulate a range of dosage for human use. The dosage contained in such compositions is preferably within a range of circulating concentrations that includes the ED₅₀ with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, the sensitivity of the patient, and the route of administration.
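
The LD₅₀/ED₅₀ calculation may be illustrated by the following Python sketch. The linear interpolation used to locate the 50% response dose is a simplifying assumption of this sketch (standard practice commonly fits a dose-response curve instead), and the function names and dose values are hypothetical.

```python
def dose_at_half_response(doses, fractions):
    """Estimate the dose at which 50% of the population responds.

    `doses` and `fractions` are paired observations (dose, responding
    fraction). Linear interpolation between the two bracketing doses is
    an assumption of this sketch, not a prescribed method.
    """
    points = sorted(zip(doses, fractions))
    for (d0, f0), (d1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            return d0 + (0.5 - f0) * (d1 - d0) / (f1 - f0)
    raise ValueError("observed responses do not bracket 50%")


def therapeutic_index(ld50, ed50):
    """Therapeutic index expressed as the LD50/ED50 ratio."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50


# Hypothetical dose-response observations (arbitrary dose units).
ed50 = dose_at_half_response([1, 2, 4, 8], [0.0, 0.25, 0.75, 1.0])
ld50 = dose_at_half_response([10, 20, 40, 80], [0.0, 0.25, 0.75, 1.0])
index = therapeutic_index(ld50, ed50)
```

A larger value of `index` corresponds to a wider margin between the therapeutically effective dose and the lethal dose, which is the property preferred above.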

[0219] The exact dosage will be determined by the practitioner, in light of factors related to the subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the active moiety or to maintain the desired effect. Factors which may be taken into account include the severity of the disease state, the general health of the subject, the age, weight, and gender of the subject, time and frequency of administration, drug combination(s), reaction sensitivities, and response to therapy. Long-acting compositions may be administered every 3 to 4 days, every week, or biweekly depending on the half-life and clearance rate of the particular formulation.

[0220] Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of about 1 gram, depending upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature and generally available to practitioners in the art. Those skilled in the art will employ different formulations for nucleotides than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, conditions, locations, etc.

[0221] DIAGNOSTICS

[0222] In another embodiment, antibodies which specifically bind MDDT may be used for the diagnosis of disorders characterized by expression of MDDT, or in assays to monitor patients being treated with MDDT or agonists, antagonists, or inhibitors of MDDT. Antibodies useful for diagnostic purposes may be prepared in the same manner as described above for therapeutics. Diagnostic assays for MDDT include methods which utilize the antibody and a label to detect MDDT in human body fluids or in extracts of cells or tissues. The antibodies may be used with or without modification, and may be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of reporter molecules, several of which are described above, are known in the art and may be used.

[0223] A variety of protocols for measuring MDDT, including ELISAs, RIAs, and FACS, are known in the art and provide a basis for diagnosing altered or abnormal levels of MDDT expression. Normal or standard values for MDDT expression are established by combining body fluids or cell extracts taken from normal mammalian subjects, for example, human subjects, with antibodies to MDDT under conditions suitable for complex formation. The amount of standard complex formation may be quantitated by various methods, such as photometric means. Quantities of MDDT expressed in subject, control, and disease samples from biopsied tissues are compared with the standard values. Deviation between standard and subject values establishes the parameters for diagnosing disease.

[0224] In another embodiment of the invention, the polynucleotides encoding MDDT may be used for diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences, complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and quantify gene expression in biopsied tissues in which expression of MDDT may be correlated with disease. The diagnostic assay may be used to determine absence, presence, and excess expression of MDDT, and to monitor regulation of MDDT levels during therapeutic intervention.

[0225] In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide sequences, including genomic sequences, encoding MDDT or closely related molecules may be used to identify nucleic acid sequences which encode MDDT. The specificity of the probe, whether it is made from a highly specific region, e.g., the 5′ regulatory region, or from a less specific region, e.g., a conserved motif, and the stringency of the hybridization or amplification will determine whether the probe identifies only naturally occurring sequences encoding MDDT, allelic variants, or related sequences.

[0226] Probes may also be used for the detection of related sequences, and may have at least 50% sequence identity to any of the MDDT encoding sequences. The hybridization probes of the subject invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:27-52 or from genomic sequences including promoters, enhancers, and introns of the MDDT gene.
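
As a simplified illustration of the 50% sequence identity criterion, the following Python sketch computes percent identity between a probe and a target region. It assumes two sequences that are already aligned, of equal length, and without gaps; the function names are hypothetical, and a practical comparison would normally rely on an alignment program rather than position-by-position matching.

```python
def percent_identity(probe, target):
    """Percent of positions at which two pre-aligned sequences agree.

    Assumes equal-length, ungapped, pre-aligned sequences, which is a
    simplification made for this sketch.
    """
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(probe.upper(), target.upper()))
    return 100.0 * matches / len(probe)


def meets_identity_threshold(probe, target, threshold=50.0):
    """True if the probe has at least `threshold` percent identity."""
    return percent_identity(probe, target) >= threshold
```

For example, a probe `ACGTACGT` compared with a target `ACGTTTTT` agrees at five of eight positions (62.5% identity) and so satisfies the 50% criterion under these assumptions.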

[0227] Means for producing specific hybridization probes for DNAs encoding MDDT include the cloning of polynucleotide sequences encoding MDDT or MDDT derivatives into vectors for the production of mRNA probes. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a variety of reporter groups, for example, by radionuclides such as ³²P or ³⁵S, or by enzymatic labels, such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

[0228] Polynucleotide sequences encoding MDDT may be used for the diagnosis of disorders associated with expression of MDDT. Examples of such disorders include, but are not limited to, a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; an autoimmune/inflammatory disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a developmental disorder such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss; and a neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia. The polynucleotide sequences encoding MDDT may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like assays; and in microarrays utilizing fluids or tissues from patients to detect altered MDDT expression. Such qualitative or quantitative methods are well known in the art.

[0229] In a particular aspect, the nucleotide sequences encoding MDDT may be useful in assays that detect the presence of associated disorders, particularly those mentioned above. The nucleotide sequences encoding MDDT may be labeled by standard methods and added to a fluid or tissue sample from a patient under conditions suitable for the formation of hybridization complexes. After a suitable incubation period, the sample is washed and the signal is quantified and compared with a standard value. If the amount of signal in the patient sample is significantly altered in comparison to a control sample, then the presence of altered levels of nucleotide sequences encoding MDDT in the sample indicates the presence of the associated disorder. Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment of an individual patient.

[0230] In order to provide a basis for the diagnosis of a disorder associated with expression of MDDT, a normal or standard profile for expression is established. This may be accomplished by combining body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a fragment thereof, encoding MDDT, under conditions suitable for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained from normal subjects with values from an experiment in which a known amount of a substantially purified polynucleotide is used. Standard values obtained in this manner may be compared with values obtained from samples from patients who are symptomatic for a disorder. Deviation from standard values is used to establish the presence of a disorder.
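
One possible way to express deviation from standard values is as a standardized score relative to the values obtained from normal subjects. The following Python sketch is offered as an illustration only: the use of a z-score, the cutoff of 2.0, the function names, and the numeric values are all assumptions of the sketch rather than requirements of the method.

```python
from statistics import mean, stdev


def deviation_from_standard(normal_values, patient_value):
    """Number of standard deviations separating a patient value from the
    mean of values obtained from normal subjects (a z-score)."""
    sd = stdev(normal_values)  # sample standard deviation; needs >= 2 values
    if sd == 0:
        raise ValueError("normal values show no variation")
    return (patient_value - mean(normal_values)) / sd


def is_altered(normal_values, patient_value, cutoff=2.0):
    """True if the patient value deviates by at least `cutoff` standard
    deviations in either direction. The cutoff is an assumed value."""
    return abs(deviation_from_standard(normal_values, patient_value)) >= cutoff


# Hypothetical hybridization signals (arbitrary units).
normal_signals = [8, 9, 10, 11, 12]
z = deviation_from_standard(normal_signals, 15)
```

Because the score is signed, the same calculation distinguishes under-expression (negative values) from over-expression (positive values) relative to the standard profile.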

[0231] Once the presence of a disorder is established and a treatment protocol is initiated, hybridization assays may be repeated on a regular basis to determine if the level of expression in the patient begins to approximate that which is observed in the normal subject. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.

[0232] With respect to cancer, the presence of an abnormal amount of transcript (either under- or overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development of the disease, or may provide a means for detecting the disease prior to the appearance of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ preventative measures or aggressive treatment earlier, thereby preventing the development or further progression of the cancer.

[0233] Additional diagnostic uses for oligonucleotides designed from the sequences encoding MDDT may involve the use of PCR. These oligomers may be chemically synthesized, generated enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide encoding MDDT, or a fragment of a polynucleotide complementary to the polynucleotide encoding MDDT, and will be employed under optimized conditions for identification of a specific gene or condition. Oligomers may also be employed under less stringent conditions for detection or quantification of closely related DNA or RNA sequences.

[0234] In a particular aspect, oligonucleotide primers derived from the polynucleotide sequences encoding MDDT may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from the polynucleotide sequences encoding MDDT are used to amplify DNA using the polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by comparing the sequence of individual overlapping DNA fragments which assemble into a common consensus sequence. These computer-based methods filter out sequence variations due to laboratory preparation of DNA and sequencing errors using statistical models and automated analyses of DNA sequence chromatograms. In the alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San Diego Calif.).

[0235] Methods which may also be used to quantify the expression of MDDT include radiolabeling or biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from standard curves. (See, e.g., Melby, P. C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al. (1993) Anal. Biochem. 212:229-236.) The speed of quantitation of multiple samples may be accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid quantitation.
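
As a non-limiting sketch, interpolation from a standard curve may be carried out as shown below. Piecewise-linear interpolation between neighboring standards is an assumption made for this example, as are the function name and the sample values; other curve-fitting methods may be equally suitable.

```python
def interpolate_from_standard_curve(standards, signal):
    """Estimate an amount from a measured signal using a standard curve.

    standards: list of (known_amount, measured_signal) pairs.
    Linear interpolation between the two bracketing standards is assumed.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (amount_lo, signal_lo), (amount_hi, signal_hi) in zip(points, points[1:]):
        if signal_lo <= signal <= signal_hi:
            if signal_hi == signal_lo:
                return amount_lo
            fraction = (signal - signal_lo) / (signal_hi - signal_lo)
            return amount_lo + (amount_hi - amount_lo) * fraction
    raise ValueError("signal lies outside the range of the standard curve")


if __name__ == "__main__":
    # Hypothetical dilution series: (amount, colorimetric response).
    curve = [(0, 0.0), (10, 0.5), (20, 1.0)]
    print(interpolate_from_standard_curve(curve, 0.75))
```

Signals outside the measured range raise an error in this sketch rather than being extrapolated.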

[0236] In further embodiments, oligonucleotides or longer fragments derived from any of the polynucleotide sequences described herein may be used as elements on a microarray. The microarray can be used in transcript imaging techniques which monitor the relative expression levels of large numbers of genes simultaneously as described below. The microarray may also be used to identify genetic variants, mutations, and polymorphisms. This information may be used to determine gene function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor progression/regression of disease as a function of gene expression, and to develop and monitor the activities of therapeutic agents in the treatment of disease. In particular, this information may be used to develop a pharmacogenomic profile of a patient in order to select the most appropriate and effective treatment regimen for that patient. For example, therapeutic agents which are highly effective and display the fewest side effects may be selected for a patient based on his/her pharmacogenomic profile.

[0237] In another embodiment, MDDT, fragments of MDDT, or antibodies specific for MDDT may be used as elements on a microarray. The microarray may be used to monitor or measure protein-protein interactions, drug-target interactions, and gene expression profiles, as described above.

[0238] A particular embodiment relates to the use of the polynucleotides of the present invention to generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by quantifying the number of expressed genes and their relative abundance under given conditions and at a given time. (See Seilhamer et al., “Comparative Gene Transcript Analysis,” U.S. Pat. No. 5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by hybridizing the polynucleotides of the present invention or their complements to the totality of transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the hybridization takes place in high-throughput format, wherein the polynucleotides of the present invention or their complements comprise a subset of a plurality of elements on a microarray. The resultant transcript image would provide a profile of gene activity.

[0239] Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

[0240] Transcript images which profile the expression of the polynucleotides of the present invention may also be used in conjunction with in vitro model systems and preclinical evaluation of pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental compounds. All compounds induce characteristic gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N. L. Anderson (2000) Toxicol. Lett. 112-113:467-471, expressly incorporated by reference herein). If a test compound has a signature similar to that of a compound with known toxicity, it is likely to share those toxic properties. These fingerprints or signatures are most useful and refined when they contain expression information from a large number of genes and gene families. Ideally, a genome-wide measurement of expression provides the highest quality signature. Genes whose expression is not altered by any tested compounds are important as well, as the levels of expression of these genes are used to normalize the rest of the expression data. The normalization procedure is useful for comparison of expression data after treatment with different compounds. While the assignment of gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical matching of signatures which leads to prediction of toxicity. (See, for example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released Feb. 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in toxicological screening using toxicant signatures to include all expressed gene sequences.
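
The statistical matching of signatures referred to above may, purely as an example, be approximated by a correlation measure. In the sketch below, the choice of Pearson correlation, the function names, and the signature values are all assumptions introduced for illustration and are not taken from the cited references.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def best_matching_signature(test_profile, known_signatures):
    """Return (name, correlation) of the known signature closest to the test.

    Profiles map gene identifiers to expression changes; only genes present
    in both the test profile and a known signature are compared.
    """
    best = None
    for name, signature in known_signatures.items():
        shared = sorted(test_profile.keys() & signature.keys())
        score = pearson([test_profile[g] for g in shared],
                        [signature[g] for g in shared])
        if best is None or score > best[1]:
            best = (name, score)
    return best


if __name__ == "__main__":
    known = {  # hypothetical toxicant signatures
        "toxicant_A": {"g1": 2.0, "g2": -1.0, "g3": 0.5},
        "toxicant_B": {"g1": -2.0, "g2": 1.0, "g3": 0.0},
    }
    test = {"g1": 1.8, "g2": -0.9, "g3": 0.6}
    print(best_matching_signature(test, known))
```

No knowledge of gene function is used in this comparison, consistent with the statement above that such knowledge is not necessary for signature matching.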

[0241] In one embodiment, the toxicity of a test compound is assessed by treating a biological sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated biological sample are hybridized with one or more probes specific to the polynucleotides of the present invention, so that transcript levels corresponding to the polynucleotides of the present invention may be quantified. The transcript levels in the treated biological sample are compared with levels in an untreated biological sample. Differences in the transcript levels between the two samples are indicative of a toxic response caused by the test compound in the treated sample.
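
For illustration, the comparison of transcript levels between treated and untreated samples may be expressed as a log2 ratio per probe. The two-fold threshold, the pseudocount, and the probe names below are assumptions chosen for this sketch; the description above does not fix any of them.

```python
import math


def changed_transcripts(treated, untreated, min_fold=2.0, pseudocount=1.0):
    """Flag probes whose signal differs between treated and untreated samples.

    treated, untreated: dicts mapping probe name to quantified signal.
    min_fold: hypothetical fold-change threshold (assumption).
    pseudocount: added to both signals to avoid division by zero (assumption).
    Returns a dict of probe name to log2(treated / untreated) for flagged probes.
    """
    threshold = math.log2(min_fold)
    flagged = {}
    for probe in treated.keys() & untreated.keys():
        ratio = (treated[probe] + pseudocount) / (untreated[probe] + pseudocount)
        log2_ratio = math.log2(ratio)
        if abs(log2_ratio) >= threshold:
            flagged[probe] = log2_ratio
    return flagged


if __name__ == "__main__":
    treated = {"probe_1": 399, "probe_2": 100, "probe_3": 24}
    untreated = {"probe_1": 99, "probe_2": 100, "probe_3": 99}
    print(changed_transcripts(treated, untreated))
```

Positive values indicate higher transcript levels in the treated sample and negative values indicate lower levels.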

[0242] Another particular embodiment relates to the use of the polypeptide sequences of the present invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles, are analyzed by quantifying the number of expressed proteins and their relative abundance under given conditions and at a given time. A profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot is generally proportional to the level of the protein in the sample. The optical densities of equivalently positioned protein spots from different samples, for example, from biological samples either treated or untreated with a test compound or therapeutic agent, are compared to identify any changes in protein spot density related to the treatment. The proteins in the spots are partially sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry. The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention. In some cases, further sequence data may be obtained for definitive protein identification.

[0243] A proteomic profile may also be generated using antibodies specific for MDDT to quantify the levels of MDDT expression. In one embodiment, the antibodies are used as elements on a microarray, and protein expression levels are quantified by exposing the microarray to the sample and detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111; Mendoza, L. G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.

[0244] Toxicant signatures at the proteome level are also useful for toxicological screening, and should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor correlation between transcript and protein abundances for some proteins in some tissues (Anderson, N. L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the analysis of compounds which do not significantly affect the transcript image, but which alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.

[0245] In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins that are expressed in the treated biological sample are separated so that the amount of each protein can be quantified. The amount of each protein is compared to the amount of the corresponding protein in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the individual proteins and comparing these partial sequences to the polypeptides of the present invention.

[0246] In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins from the biological sample are incubated with antibodies specific to the polypeptides of the present invention. The amount of protein recognized by the antibodies is quantified. The amount of protein in the treated biological sample is compared with the amount in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample.

[0247] Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g., Brennan, T. M. et al. (1995) U.S. Pat. No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al. (1995) PCT application WO95/35505; Heller, R. A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155; and Heller, M. J. et al. (1997) U.S. Pat. No. 5,605,662.) Various types of microarrays are well known and thoroughly described in DNA Microarrays: A Practical Approach, M. Schena, ed. (1999) Oxford University Press, London, hereby expressly incorporated by reference.

[0248] In another embodiment of the invention, nucleic acid sequences encoding MDDT may be used to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either coding or noncoding sequences may be used, and in some instances, noncoding sequences may be preferable over coding sequences. For example, conservation of a coding sequence among members of a multi-gene family may potentially cause undesired cross hybridization during chromosomal mapping. The sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355; Price, C. M. (1993) Blood Rev. 7:127-134; and Trask, B. J. (1991) Trends Genet. 7:149-154.) Once mapped, the nucleic acid sequences of the invention may be used to develop genetic linkage maps, for example, which correlate the inheritance of a disease state with the inheritance of a particular chromosome region or restriction fragment length polymorphism (RFLP). (See, for example, Lander, E. S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)

[0249] Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic map data. (See, e.g., Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968.) Examples of genetic map data can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) World Wide Web site. Correlation between the location of the gene encoding MDDT on a physical map and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA associated with that disorder and thus may further positional cloning efforts.

[0250] In situ hybridization of chromosomal preparations and physical mapping techniques, such as linkage analysis using established chromosomal markers, may be used for extending genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the exact chromosomal locus is not known. This information is valuable to investigators searching for disease genes using positional cloning or other gene discovery techniques. Once the gene or genes responsible for a disease or syndrome have been crudely localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any sequences mapping to that area may represent associated or regulatory genes for further investigation. (See, e.g., Gatti, R. A. et al. (1988) Nature 336:577-580.) The nucleotide sequence of the instant invention may also be used to detect differences in the chromosomal location due to translocation, inversion, etc., among normal, carrier, or affected individuals.

[0251] In another embodiment of the invention, MDDT, its catalytic or immunogenic fragments, or oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug screening techniques. The fragment employed in such screening may be free in solution, affixed to a solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes between MDDT and the agent being tested may be measured.

[0252] Another technique for drug screening provides for high throughput screening of compounds having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT application WO84/03564.) In this method, large numbers of different small test compounds are synthesized on a solid substrate. The test compounds are reacted with MDDT, or fragments thereof, and washed. Bound MDDT is then detected by methods well known in the art. Purified MDDT can also be coated directly onto plates for use in the aforementioned drug screening techniques. Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

[0253] In another embodiment, one may use competitive drug screening assays in which neutralizing antibodies capable of binding MDDT specifically compete with a test compound for binding MDDT. In this manner, antibodies can be used to detect the presence of any peptide which shares one or more antigenic determinants with MDDT.

[0254] In additional embodiments, the nucleotide sequences which encode MDDT may be used in any molecular biology techniques that have yet to be developed, provided the new techniques rely on properties of nucleotide sequences that are currently known, including, but not limited to, such properties as the triplet genetic code and specific base pair interactions.

[0255] Without further elaboration, it is believed that one skilled in the art can, using the preceding description, utilize the present invention to its fullest extent. The following embodiments are, therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever.

[0256] The disclosures of all patents, applications and publications, mentioned above and below, including U.S. Ser. No. 60/260,168, U.S. Ser. No. 60/262,857, and U.S. Ser. No. 60/262,736, are expressly incorporated by reference herein.

EXAMPLES

[0257] Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto Calif.). Some tissues were homogenized and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and ethanol, or by other routine methods.

[0258] Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN, Chatsworth Calif.), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Austin TX).

[0259] In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP vector system (Stratagene) or SUPERSCRIPT plasmid system (Life Technologies), using the recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, units 5.1-6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid (Invitrogen, Carlsbad Calif.), PBK-CMV plasmid (Stratagene), PCR2-TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte Genomics, Palo Alto Calif.), pRARE (Incyte Genomics), or pINCY (Incyte Genomics), or derivatives thereof. Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.

[0260] II. Isolation of cDNA Clones

[0261] Plasmids obtained as described in Example I were recovered from host cells by in vivo excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg Md.); and QIAWELL 8 Plasmid, QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or without lyophilization, at 4° C.

[0262] Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a high-throughput format (Rao, V. B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using PICOGREEN dye (Molecular Probes, Eugene Oreg.) and a FLUOROSKAN II fluorescence scanner (Labsystems Oy, Helsinki, Finland).

[0263] III. Sequencing and Analysis

[0264] Incyte cDNAs recovered in plasmids as described in Example II were sequenced as follows. Sequencing reactions were processed using standard methods or high-throughput instrumentation such as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI protocols and base calling software; or other sequence analysis systems known in the art. Reading frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel, 1997, supra, unit 7.7). Some of the cDNA sequences were selected for extension using the techniques disclosed in Example VIII.

[0265] The polynucleotide sequences derived from Incyte cDNAs were validated by removing vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The Incyte cDNA sequences or translations thereof were then queried against a selection of public databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and BLOCKS, PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo sapiens, Rattus norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Candida albicans (Incyte Genomics, Palo Alto Calif.); and hidden Markov model (HMM)-based protein family databases such as PFAM. (HMM is a probabilistic approach which analyzes consensus primary structures of gene families. See, for example, Eddy, S. R. (1996) Curr. Opin. Struct. Biol. 6:361-365.) The queries were performed using programs based on BLAST, FASTA, BLIMPS, and HMMER. The Incyte cDNA sequences were assembled to produce full length polynucleotide sequences. Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or Genscan-predicted coding sequences (see Examples IV and V) were used to extend Incyte cDNA assemblages to full length. Assembly was performed using programs based on Phred, Phrap, and Consed, and cDNA assemblages were screened for open reading frames using programs based on GeneMark, BLAST, and FASTA. The full length polynucleotide sequences were translated to derive the corresponding full length polypeptide sequences. Alternatively, a polypeptide of the invention may begin at any of the methionine residues of the full length translated polypeptide. Full length polypeptide sequences were subsequently analyzed by querying against databases such as the GenBank protein databases (genpept), SwissProt, the PROTEOME databases, BLOCKS, PRINTS, DOMO, PRODOM, Prosite, and hidden Markov model (HMM)-based protein family databases such as PFAM. Full length polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco Calif.) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence alignments are generated using default parameters specified by the CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent identity between aligned sequences.
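
As a simple illustration of the statement that a polypeptide may begin at any methionine residue of the full length translated polypeptide, the candidate polypeptides can be enumerated as follows. The function name and the example sequence are hypothetical and are not drawn from the Sequence Listing.

```python
def polypeptides_from_methionines(full_length):
    """List every polypeptide that starts at a methionine (M) residue.

    full_length: one-letter amino acid sequence of the translated polypeptide.
    Each candidate runs from a methionine to the end of the sequence.
    """
    return [full_length[i:] for i, residue in enumerate(full_length)
            if residue == "M"]


if __name__ == "__main__":
    # Hypothetical seven-residue sequence with two methionines.
    for candidate in polypeptides_from_methionines("MKTAMLV"):
        print(candidate)
```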

[0266] Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of Incyte cDNA and full length sequences and provides applicable descriptions, references, and threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the second column provides brief descriptions thereof, the third column presents appropriate references, all of which are incorporated by reference herein in their entirety, and the fourth column presents, where applicable, the scores, probability values, and other parameters used to evaluate the strength of a match between two sequences (the higher the score or the lower the probability value, the greater the identity between two sequences).

[0267] The programs described above for the assembly and analysis of full length polynucleotide and polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ ID NO:27-52. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and amplification technologies are described in Table 4, column 2.

[0268] IV. Identification and Editing of Coding Sequences from Genomic DNA

[0269] Putative molecules for disease detection and treatment were initially identified by running the Genscan gene identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is a general-purpose gene identification program which analyzes genomic DNA sequences from a variety of organisms (See Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94, and Burge, C. and S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to form an assembled cDNA sequence extending from a methionine to a stop codon. The output of Genscan is a FASTA database of polynucleotide and polypeptide sequences. The maximum range of sequence for Genscan to analyze at once was set to 30 kb. To determine which of these Genscan predicted cDNA sequences encode molecules for disease detection and treatment, the encoded polypeptides were analyzed by querying against PFAM models for molecules for disease detection and treatment. Potential molecules for disease detection and treatment were also identified by homology to Incyte cDNA sequences that had been annotated as molecules for disease detection and treatment. These selected Genscan-predicted sequences were then compared by BLAST analysis to the genpept and gbpri public databases. Where necessary, the Genscan-predicted sequences were then edited by comparison to the top BLAST hit from genpept to correct errors in the sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis was also used to find any Incyte cDNA or public cDNA coverage of the Genscan-predicted sequences, thus providing evidence for transcription. When Incyte cDNA coverage was available, this information was used to correct or confirm the Genscan predicted sequence. Full length polynucleotide sequences were obtained by assembling Genscan-predicted coding sequences with Incyte cDNA sequences and/or public cDNA sequences using the assembly process described in Example III. Alternatively, full length polynucleotide sequences were derived entirely from edited or unedited Genscan-predicted coding sequences.
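
The concatenation of predicted exons into an assembled sequence extending from a methionine codon to a stop codon may be pictured with the following sketch. It is not the Genscan algorithm, which predicts the exons themselves; the exon coordinates, the toy genomic sequence, and the function name are assumptions made for this example, and only forward-strand exons are handled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_predicted_cds(genomic, exons):
    """Concatenate exons and check that the result is a complete coding sequence.

    genomic: genomic DNA sequence (forward strand).
    exons: list of (start, end) pairs, 0-based and end-exclusive (assumption).
    Returns (cds, is_complete), where is_complete means the sequence starts
    with ATG, ends with a stop codon, has a length divisible by three, and
    contains no in-frame stop codon before the last codon.
    """
    cds = "".join(genomic[start:end] for start, end in sorted(exons)).upper()
    internal_stop = any(cds[i:i + 3] in STOP_CODONS
                        for i in range(0, len(cds) - 3, 3))
    is_complete = (len(cds) % 3 == 0
                   and cds.startswith("ATG")
                   and cds[-3:] in STOP_CODONS
                   and not internal_stop)
    return cds, is_complete


if __name__ == "__main__":
    # Toy sequence: two exons (upper case) separated by an intron (lower case).
    genomic = "ccATGAAAgtaagtttcagGGGTAAcc"
    print(assemble_predicted_cds(genomic, [(2, 8), (19, 25)]))
```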

[0270] V. Assembly of Genomic Sequence Data with cDNA Sequence Data

[0271] “Stitched” Sequences

[0272] Partial cDNA sequences were extended with exons predicted by the Genscan gene identification program described in Example IV. Partial cDNAs assembled as described in Example III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm based on graph theory and dynamic programming to integrate cDNA and genomic information, generating possible splice variants that were subsequently confirmed, edited, or extended to create a full length sequence. Sequence intervals in which the entire length of the interval was present on more than one sequence in the cluster were identified, and intervals thus identified were considered to be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic sequences, then all three intervals were considered to be equivalent. This process allows unrelated but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals thus identified were then “stitched” together by the stitching algorithm in the order that they appear along their parent sequences to generate the longest possible sequence, as well as sequence variants. Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or genomic sequence to genomic sequence) were given preference over linkages which change parent type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended with additional cDNA sequences, or by inspection of genomic DNA, when necessary.
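
Equivalence by transitivity, as described above, may be illustrated with a union-find (disjoint-set) structure. This is offered only as one possible way to group intervals; the stitching algorithm itself is not reproduced here, and the interval labels and function name are hypothetical.

```python
def equivalence_classes(matching_pairs):
    """Group intervals into equivalence classes by transitivity.

    matching_pairs: iterable of (interval_a, interval_b) pairs, each stating
    that the two intervals cover the same stretch of sequence.
    Returns a sorted list of sorted groups.
    """
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path compression
            item = parent[item]
        return item

    for a, b in matching_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    pairs = [
        ("cdna_a", "genomic_1"),  # interval seen on a cDNA and genomic sequence 1
        ("cdna_a", "genomic_2"),  # the same interval seen on genomic sequence 2
        ("cdna_b", "genomic_3"),
    ]
    print(equivalence_classes(pairs))
```

In this example the intervals on genomic sequences 1 and 2 are never compared directly, yet they fall into one class because both match the same cDNA interval.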

[0273] “Stretched” Sequences

[0274] Partial DNA sequences were extended to full length with an algorithm based on BLAST analysis. First, partial cDNAs assembled as described in Example III were queried against public databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST analysis to either Incyte cDNA sequences or Genscan exon predicted sequences described in Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs (HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions may occur in the chimeric protein with respect to the original GenBank protein homolog. The GenBank protein homolog, the chimeric protein, or both were used as probes to search for homologous genomic sequences from the public human genome databases. Partial DNA sequences were therefore “stretched” or extended by the addition of homologous genomic sequences. The resultant stretched sequences were examined to determine whether they contained a complete gene.

[0275] VI. Chromosomal Mapping of MDDT Encoding Polynucleotides

[0276] The sequences which were used to assemble SEQ ID NO:27-52 were compared with sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other implementations of the Smith-Waterman algorithm. Sequences from these databases that matched SEQ ID NO:27-52 were assembled into clusters of contiguous and overlapping sequences using assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.

[0277] Map locations are represented by ranges, or intervals, of human chromosomes. The map position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances are based on genetic markers mapped by Généthon which provide boundaries for radiation hybrid markers whose sequences were included in each of the clusters. Human genome maps and other resources available to the public, such as the NCBI “GeneMap'99” World Wide Web site (http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified disease genes map within or in proximity to the intervals indicated above.

[0278] In this manner, SEQ ID NO:28 was mapped to chromosome 1 within the interval from 137.6 to 149.0 centiMorgans; SEQ ID NO:30 was mapped to chromosome 9 within the interval from 50.3 to 75.8 centiMorgans; SEQ ID NO:35 was mapped to chromosome 4 within the interval from 61.0 to 62.7 centiMorgans; SEQ ID NO:37 was mapped to chromosome 4 within the interval from 145.3 to 170.9 centiMorgans; SEQ ID NO:38 was mapped to chromosome 4 within the interval from 81.9 to 115.1 centiMorgans; and SEQ ID NO:40 was mapped to chromosome 7 within the interval from 47.9 to 62.8 centiMorgans.
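
As an illustration of how the intervals listed above might be checked against the map position of a known locus, the following sketch looks up which mapped sequences lie within, or within a chosen margin of, a given position. The interval values are copied from the preceding paragraph. The function name, the `margin_cm` parameter, and the query positions are assumptions made for this example and do not correspond to any real disease gene.

```python
# SEQ ID NO -> (chromosome, interval start in cM, interval end in cM),
# copied from the mapping results stated in the text.
MAPPED_INTERVALS = {
    28: ("1", 137.6, 149.0),
    30: ("9", 50.3, 75.8),
    35: ("4", 61.0, 62.7),
    37: ("4", 145.3, 170.9),
    38: ("4", 81.9, 115.1),
    40: ("7", 47.9, 62.8),
}


def seq_ids_near(chromosome, position_cm, margin_cm=0.0):
    """Return SEQ ID NOs whose interval contains the position.

    `margin_cm` (hypothetical parameter) widens each interval on both
    sides to capture loci "in proximity to" an interval.
    """
    hits = []
    for seq_id, (chrom, start, end) in MAPPED_INTERVALS.items():
        if chrom == chromosome and start - margin_cm <= position_cm <= end + margin_cm:
            hits.append(seq_id)
    return sorted(hits)


# Hypothetical query positions, for illustration only.
print(seq_ids_near("4", 62.0))                 # position inside one interval
print(seq_ids_near("4", 80.0, margin_cm=2.0))  # position just outside an interval
```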

[0279] VII. Analysis of Polynucleotide Expression

[0280] Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel (1995) supra, ch. 4 and 16.)

[0281] Analogous computer techniques applying BLAST were used to search for identical or related molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or similar. The basis of the search is the product score, which is defined as:

$$\frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length}(\text{Seq. 1}),\ \text{length}(\text{Seq. 2})\}}$$

[0282] The product score takes into account both the degree of similarity between two sequences and the length of the sequence match. The product score is a normalized value between 0 and 100, and is calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and −4 for every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate the product score. The product score represents a balance between fractional overlap and quality in a BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the entire length of the shorter of the two sequences being compared. A product score of 70 is produced either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap.
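
The calculation described above can be sketched in a few lines. This is a minimal illustration, assuming that the HSP contains only matches and mismatches and that percent identity is taken over the HSP; the function and parameter names are hypothetical.

```python
def blast_score(matches, mismatches):
    """Score an HSP with +5 per matching base and -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches, mismatches, len_seq1, len_seq2):
    """Product score as defined in the text.

    Assumes (for this sketch) that percent identity is computed over the
    aligned HSP, i.e. matches / (matches + mismatches) * 100.
    """
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)


# Worked cases for a shorter sequence of 100 bases.
print(product_score(100, 0, 100, 250))  # full-length, 100% identity
print(product_score(70, 0, 100, 250))   # 70% overlap, 100% identity
print(product_score(88, 12, 100, 250))  # full overlap, 88% identity
print(product_score(79, 21, 100, 250))  # full overlap, 79% identity
```

Under these assumptions the first two cases give exactly 100 and 70, while the 88% and 79% identity cases give about 69.0 and 49.1, close to the rounded values of 70 and 50 quoted in the text.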

[0283] Alternatively, polynucleotide sequences encoding MDDT are analyzed with respect to the tissue sources from which they were derived. For example, some full length sequences are assembled, at least in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the following organ/tissue categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. The number of libraries in each category is counted and divided by the total number of libraries across all categories. Similarly, each human tissue is classified into one of the following disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma, cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided by the total number of libraries across all categories. The resulting percentages reflect the tissue- and disease-specific expression of cDNA encoding MDDT. cDNA sequences and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto Calif.).
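
The per-category percentages described above amount to a simple count-and-divide step. The sketch below assumes each contributing library has already been assigned one category label; the function name and the input list are hypothetical and do not reflect actual LIFESEQ data.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return the percentage of libraries falling in each category.

    `library_categories` holds one category label per cDNA library.
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical category assignments for four libraries.
libraries = ["nervous system", "nervous system", "liver", "digestive system"]
percentages = category_percentages(libraries)
print(percentages)
```

The same function applies unchanged to the disease/condition categories, since only the labels differ.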

[0284] VIII. Extension of MDDT Encoding Polynucleotides

[0285] Full length polynucleotide sequences were also produced by extension of an appropriate

1 52 1 127 PRT Homo sapiens misc_feature Incyte ID No 6242606CD1 1 MetLeu Thr Gln Gln Gly Met Ala Leu Gln Asn Tyr Asp Asn Lys 1 5 10 15 LeuVal Lys Cys Ile Glu Glu Leu Cys Gln Lys Gln Glu Glu Leu 20 25 30 Cys TrpGln Ile Gln Gln Glu Glu Asp Lys Lys Gln Arg Leu Gln 35 40 45 Asn Glu ValArg Gln Leu Thr Glu Lys Leu Ala Cys Val Asn Glu 50 55 60 Lys Leu Ala ArgVal Asn Glu Asn Leu Ala Arg Lys Ile Ala Ser 65 70 75 Cys Ser Lys Phe TyrGln Thr Ile Ala Glu Thr Glu Ala Thr Tyr 80 85 90 Leu Lys Met Leu Glu SerSer Gln Thr Leu Leu Ser Val Leu Lys 95 100 105 Arg Glu Ala Gly Asn LeuThr Lys Ala Thr Ala Ser Asp Gln Lys 110 115 120 Ser Ser Gly Gly Arg AspSer 125 2 258 PRT Homo sapiens misc_feature Incyte ID No 997105CD1 2 MetSer Asn Asn Leu Arg Arg Val Phe Leu Lys Pro Ala Glu Glu 1 5 10 15 AsnSer Gly Asn Ala Ser Arg Cys Val Ser Gly Cys Met Tyr Gln 20 25 30 Val ValGln Thr Ile Gly Ser Asp Gly Lys Asn Leu Leu Gln Leu 35 40 45 Leu Pro IlePro Lys Ser Ser Gly Asn Leu Ile Pro Leu Val Gln 50 55 60 Ser Ser Val MetSer Asp Ala Leu Lys Gly Asn Thr Gly Lys Pro 65 70 75 Val Gln Val Thr PheGln Thr Gln Ile Ser Ser Ser Ser Thr Ser 80 85 90 Ala Ser Val Gln Leu ProIle Phe Gln Pro Ala Ser Ser Ser Asn 95 100 105 Tyr Phe Leu Thr Arg ThrVal Asp Thr Ser Glu Lys Gly Arg Val 110 115 120 Thr Ser Val Gly Thr GlyAsn Phe Ser Ser Ser Val Ser Lys Val 125 130 135 Gln Ser His Gly Val LysIle Asp Gly Leu Thr Met Gln Thr Phe 140 145 150 Ala Val Pro Pro Ser ThrGln Lys Asp Ser Ser Phe Ile Val Val 155 160 165 Asn Thr Gln Ser Leu ProVal Thr Val Lys Ser Pro Val Leu Pro 170 175 180 Ser Gly His His Leu GlnIle Pro Ala His Ala Glu Val Lys Ser 185 190 195 Val Pro Ala Ser Ser LeuPro Pro Ser Val Gln Gln Lys Ile Leu 200 205 210 Ala Thr Ala Thr Thr SerThr Ser Gly Met Val Glu Ala Ser Gln 215 220 225 Met Pro Thr Val Ile TyrVal Ser Pro Val Ile Leu Thr Leu Trp 230 235 240 Lys Ser Glu Ala Gly GlySer Leu Glu Ala Arg Ser Ser Ala Gly 245 250 255 Gln Trp Gln 3 175 PRTHomo sapiens misc_feature Incyte ID No 1482843CD1 3 Met Gly Leu Ser ProSer Ser 
Thr Arg Gly Cys Leu Leu Ser Cys 1 5 10 15 Glu Ala Ser Pro GlyPro Gly Ser Arg Gly Val Cys Thr Pro His 20 25 30 Thr His Ile Pro Arg ProTyr Arg Thr Gly Cys Gln Met Ala Leu 35 40 45 Val Ser Ser Cys Pro Trp TrpAsp Arg Val Leu Leu His Ala Pro 50 55 60 Ser Glu Gln Leu Cys Glu Val GlyArg Glu Asn Ala Thr Ile Leu 65 70 75 Cys Asn Gln Trp Glu Asn Arg Gly ThrGlu Gly Trp His Gly Pro 80 85 90 Gly Leu Val Cys Pro Thr Ser Leu Val LeuGly Met Val Trp Gly 95 100 105 Met Asp Arg Gly Ala Pro Gly Cys Arg AlaAsp Pro Gln Pro Ser 110 115 120 Ser Ser Arg Gly Thr Gly Ser Ala Thr ArgThr Pro Ser Cys Trp 125 130 135 Met Glu Thr Leu Pro Met Phe Ser Pro CysVal Ala Lys Asp Gly 140 145 150 Ala Val Gln Pro Asp Gly Arg Tyr Gly ValGly Arg Arg Ala Pro 155 160 165 Gly Lys Gly Leu Pro Thr Ala Arg Met His170 175 4 1146 PRT Homo sapiens misc_feature Incyte ID No 3218219CD1 4Met Gly Pro Ala Pro Ala Gly Glu Gln Leu Arg Gly Ala Thr Gly 1 5 10 15Glu Pro Glu Val Met Glu Pro Ala Leu Glu Gly Thr Gly Lys Glu 20 25 30 GlyLys Lys Ala Ser Ser Arg Lys Arg Thr Leu Ala Glu Pro Pro 35 40 45 Ala LysGly Leu Leu Gln Pro Val Lys Leu Ser Arg Ala Glu Leu 50 55 60 Tyr Lys GluPro Thr Asn Glu Glu Leu Asn Arg Leu Arg Glu Thr 65 70 75 Glu Ile Leu PheHis Ser Ser Leu Leu Arg Leu Gln Val Glu Glu 80 85 90 Leu Leu Lys Glu ValArg Leu Ser Glu Lys Lys Lys Asp Arg Ile 95 100 105 Asp Ala Phe Leu ArgGlu Val Asn Gln Arg Val Val Arg Val Pro 110 115 120 Ser Val Pro Glu ThrGlu Leu Thr Asp Gln Ala Trp Leu Pro Ala 125 130 135 Gly Val Arg Val ProLeu His Gln Val Pro Tyr Ala Val Lys Gly 140 145 150 Cys Phe Arg Phe LeuPro Pro Ala Gln Val Thr Val Val Gly Ser 155 160 165 Tyr Leu Leu Gly ThrCys Ile Arg Pro Asp Ile Asn Val Asp Val 170 175 180 Ala Leu Thr Met ProArg Glu Ile Leu Gln Asp Lys Asp Gly Leu 185 190 195 Asn Gln Arg Tyr PheArg Lys Arg Ala Leu Tyr Leu Ala His Leu 200 205 210 Ala His His Leu AlaGln Asp Pro Leu Phe Gly Ser Val Cys Phe 215 220 225 Ser Tyr Thr Asn GlyCys His Leu Lys Pro Ser Leu Leu Leu Arg 230 235 240 Pro Arg Gly Lys AspGlu Arg Leu Val 
Thr Val Arg Leu His Pro 245 250 255 Cys Pro Pro Pro AspPhe Phe Arg Pro Cys Arg Leu Leu Pro Thr 260 265 270 Lys Asn Asn Val ArgSer Ala Trp Tyr Arg Gly Gln Ser Pro Ala 275 280 285 Gly Asp Gly Ser ProGlu Pro Pro Thr Pro Arg Tyr Asn Thr Trp 290 295 300 Val Leu Gln Asp ThrVal Leu Glu Ser His Leu Gln Leu Leu Ser 305 310 315 Thr Ile Leu Ser SerAla Gln Gly Leu Lys Asp Gly Val Ala Leu 320 325 330 Leu Lys Val Trp LeuArg Gln Arg Glu Leu Asp Lys Gly Gln Gly 335 340 345 Gly Phe Thr Gly PheLeu Val Ser Met Leu Val Val Phe Leu Val 350 355 360 Ser Thr Arg Lys IleHis Thr Thr Met Ser Gly Tyr Gln Val Leu 365 370 375 Arg Ser Val Leu GlnPhe Leu Ala Thr Thr Asp Leu Thr Val Asn 380 385 390 Gly Ile Ser Leu CysLeu Ser Ser Asp Pro Ser Leu Pro Ala Leu 395 400 405 Ala Asp Phe His GlnAla Phe Ser Val Val Phe Leu Asp Ser Ser 410 415 420 Gly His Leu Asn LeuCys Ala Asp Val Thr Ala Ser Thr Tyr His 425 430 435 Gln Val Gln His GluAla Arg Leu Ser Met Met Leu Leu Asp Ser 440 445 450 Arg Ala Asp Asp GlyPhe His Leu Leu Leu Met Thr Pro Lys Pro 455 460 465 Met Ile Arg Ala PheAsp His Val Leu His Leu Arg Pro Leu Ser 470 475 480 Arg Leu Gln Ala AlaCys His Arg Leu Lys Leu Trp Pro Glu Leu 485 490 495 Gln Asp Asn Gly GlyAsp Tyr Val Ser Ala Ala Leu Gly Pro Leu 500 505 510 Thr Thr Leu Leu GluGln Gly Leu Gly Ala Arg Leu Asn Leu Leu 515 520 525 Ala His Ser Arg ProPro Val Pro Glu Trp Asp Ile Ser Gln Asp 530 535 540 Pro Pro Lys His LysAsp Ser Gly Thr Leu Thr Leu Gly Leu Leu 545 550 555 Leu Arg Pro Glu GlyLeu Thr Ser Val Leu Glu Leu Gly Pro Glu 560 565 570 Ala Asp Gln Pro GluAla Ala Lys Phe Arg Gln Phe Trp Gly Ser 575 580 585 Arg Ser Glu Leu ArgArg Phe Gln Asp Gly Ala Ile Arg Glu Ala 590 595 600 Val Val Trp Glu AlaAla Ser Met Ser Gln Lys Arg Leu Ile Pro 605 610 615 His Gln Val Val ThrHis Leu Leu Ala Leu His Ala Asp Ile Pro 620 625 630 Glu Thr Cys Val HisTyr Val Gly Gly Pro Leu Asp Ala Leu Ile 635 640 645 Gln Gly Leu Lys GluThr Ser Ser Thr Gly Glu Glu Ala Leu Val 650 655 660 Ala Ala Val Arg CysTyr Asp Asp Leu Ser Arg Leu 
Leu Trp Gly 665 670 675 Leu Glu Gly Leu ProLeu Thr Val Ser Ala Val Gln Gly Ala His 680 685 690 Pro Val Leu Arg TyrThr Glu Val Phe Pro Pro Thr Pro Val Arg 695 700 705 Pro Ala Phe Ser PheTyr Glu Thr Leu Arg Glu Arg Ser Ser Leu 710 715 720 Leu Pro Arg Leu AspLys Pro Cys Pro Ala Tyr Val Glu Pro Met 725 730 735 Thr Val Val Cys HisLeu Glu Gly Ser Gly Gln Trp Pro Gln Asp 740 745 750 Ala Glu Ala Val GlnArg Val Arg Ala Ala Phe Gln Leu Arg Leu 755 760 765 Ala Glu Leu Leu ThrGln Gln His Gly Leu Gln Cys Arg Ala Thr 770 775 780 Ala Thr His Thr AspVal Leu Lys Asp Gly Phe Val Phe Arg Ile 785 790 795 Arg Val Ala Tyr GlnArg Glu Pro Gln Ile Leu Lys Glu Val Gln 800 805 810 Ser Pro Glu Gly MetIle Ser Leu Arg Asp Thr Ala Ala Ser Leu 815 820 825 Arg Leu Glu Arg AspThr Arg Gln Leu Pro Leu Leu Thr Ser Ala 830 835 840 Leu His Gly Leu GlnGln Gln His Pro Ala Phe Ser Gly Val Ala 845 850 855 Arg Leu Ala Lys ArgTrp Val Arg Ala Gln Leu Leu Gly Glu Gly 860 865 870 Phe Ala Asp Glu SerLeu Asp Leu Val Ala Ala Ala Leu Phe Leu 875 880 885 His Pro Glu Pro PheThr Pro Pro Ser Ser Pro Gln Val Gly Phe 890 895 900 Leu Arg Phe Leu PheLeu Val Ser Thr Phe Asp Trp Lys Asn Asn 905 910 915 Pro Leu Phe Val AsnLeu Asn Asn Glu Leu Thr Val Glu Glu Gln 920 925 930 Val Glu Ile Arg SerGly Phe Leu Ala Ala Arg Ala Gln Leu Pro 935 940 945 Val Met Val Ile ValThr Pro Gln Asp Arg Lys Asn Ser Val Trp 950 955 960 Thr Gln Asp Gly ProSer Ala Gln Ile Leu Gln Gln Leu Val Val 965 970 975 Leu Ala Ala Glu AlaLeu Pro Met Leu Glu Lys Gln Leu Met Asp 980 985 990 Pro Arg Gly Pro GlyAsp Ile Arg Thr Val Phe Arg Pro Pro Leu 995 1000 1005 Asp Ile Tyr AspVal Leu Ile Arg Leu Ser Pro Arg His Ile Pro 1010 1015 1020 Arg His ArgGln Ala Val Asp Ser Pro Ala Ala Ser Phe Cys Arg 1025 1030 1035 Gly LeuLeu Ser Gln Pro Gly Pro Ser Ser Leu Met Pro Val Leu 1040 1045 1050 GlyTyr Asp Pro Pro Gln Leu Tyr Leu Thr Gln Leu Arg Glu Ala 1055 1060 1065Phe Gly Asp Leu Ala Leu Phe Phe Tyr Asp Gln His Gly Gly Glu 1070 10751080 Val Ile Gly Val Leu Trp Lys Pro Thr Ser 
Phe Gln Pro Gln Pro 10851090 1095 Phe Lys Ala Ser Ser Thr Lys Gly Arg Met Val Met Ser Arg Gly1100 1105 1110 Gly Glu Leu Val Met Val Pro Asn Val Glu Ala Ile Leu GluAsp 1115 1120 1125 Phe Ala Val Leu Gly Glu Gly Leu Val Gln Thr Val GluAla Arg 1130 1135 1140 Ser Glu Arg Trp Thr Val 1145 5 351 PRT Homosapiens misc_feature Incyte ID No 7371678CD1 5 Met Pro Lys Leu Val LysAsn Leu Leu Gly Glu Met Pro Leu Trp 1 5 10 15 Val Cys Gln Ser Cys ArgLys Ser Met Glu Glu Asp Glu Arg Gln 20 25 30 Thr Gly Arg Glu His Ala ValAla Ile Ser Leu Ser His Thr Ser 35 40 45 Cys Lys Ser Gln Ser Cys Gly AspAsp Ser His Ser Ser Ser Ser 50 55 60 Ser Ser Ser Ser Ser Ser Ser Ser SerSer Ser Ser Cys Pro Gly 65 70 75 Asn Ser Gly Asp Trp Asp Pro Ser Ser PheLeu Ser Ala His Lys 80 85 90 Leu Ser Gly Leu Trp Asn Ser Pro His Ser SerGly Ala Met Pro 95 100 105 Gly Ser Ser Leu Gly Ser Pro Pro Thr Ile ProGly Glu Ala Phe 110 115 120 Pro Val Ser Glu His His Gln His Ser Asp LeuThr Ala Pro Pro 125 130 135 Asn Ser Pro Thr Gly His His Pro Gln Pro AlaSer Leu Ile Pro 140 145 150 Ser His Pro Ser Ser Phe Gly Ser Pro Pro HisPro His Leu Leu 155 160 165 Pro Thr Thr Pro Ala Ala Pro Phe Pro Ala GlnAla Ser Glu Cys 170 175 180 Pro Val Ala Ala Ala Thr Ala Pro His Thr ProGly Pro Cys Gln 185 190 195 Ser Ser His Leu Pro Ser Thr Ser Met Pro LeuLeu Lys Met Pro 200 205 210 Pro Pro Phe Ser Gly Cys Ser His Pro Cys SerGly His Cys Gly 215 220 225 Gly His Cys Ser Gly Pro Leu Leu Pro Pro ProSer Ser Gln Pro 230 235 240 Leu Pro Ser Thr His Arg Asp Pro Gly Cys LysGly His Lys Phe 245 250 255 Ala His Ser Gly Leu Ala Cys Gln Leu Pro GlnPro Cys Glu Ala 260 265 270 Asp Glu Gly Leu Gly Glu Glu Glu Asp Ser SerSer Glu Arg Ser 275 280 285 Ser Cys Thr Ser Ser Ser Thr His Gln Arg AspGly Lys Phe Cys 290 295 300 Asp Cys Cys Tyr Cys Glu Phe Phe Gly His AsnAla Val Ser Glu 305 310 315 Pro Ala Gln Ala Arg Glu Gly Trp Gly Pro AlaLys Val Ser Lys 320 325 330 Arg Ser Thr Leu Leu Ser Pro Glu Thr Pro LeuLeu Thr Val Val 335 340 345 Ile Thr Val Leu Gly Pro 350 6 286 PRT 
Homosapiens misc_feature Incyte ID No 7473883CD1 6 Met Ser Leu Pro Ile GlyIle Tyr Arg Arg Ala Val Ser Tyr Asp 1 5 10 15 Asp Thr Leu Glu Asp ProAla Pro Met Thr Pro Pro Pro Ser Asp 20 25 30 Met Gly Ser Val Pro Trp LysPro Val Ile Pro Glu Arg Lys Tyr 35 40 45 Gln His Leu Ala Lys Val Glu GluGly Glu Ala Ser Leu Pro Ser 50 55 60 Pro Ala Met Thr Leu Ser Ser Ala IleAsp Ser Val Asp Lys Val 65 70 75 Pro Val Val Lys Ala Lys Ala Thr His ValIle Met Asn Ser Leu 80 85 90 Ile Thr Lys Gln Thr Gln Glu Ser Ile Gln HisPhe Glu Arg Gln 95 100 105 Ala Gly Leu Arg Asp Ala Gly Tyr Thr Pro HisLys Gly Leu Thr 110 115 120 Thr Glu Glu Thr Lys Tyr Leu Arg Val Ala GluAla Leu His Lys 125 130 135 Leu Lys Leu Gln Ser Gly Glu Val Thr Lys GluGlu Arg Gln Pro 140 145 150 Ala Ser Ala Gln Ser Thr Pro Ser Thr Thr ProHis Ser Ser Pro 155 160 165 Lys Gln Arg Pro Arg Gly Trp Phe Thr Ser GlySer Ser Thr Ala 170 175 180 Leu Pro Gly Pro Asn Pro Ser Thr Met Asp SerGly Ser Gly Asp 185 190 195 Lys Asp Arg Asn Leu Ser Asp Lys Trp Ser LeuPhe Gly Pro Arg 200 205 210 Ser Leu Gln Lys Tyr Asp Ser Gly Ser Phe AlaThr Gln Ala Tyr 215 220 225 Arg Gly Ala Gln Lys Pro Ser Pro Leu Glu LeuIle Arg Ala Gln 230 235 240 Ala Asn Arg Met Ala Glu Asp Pro Ala Ala LeuLys Pro Pro Lys 245 250 255 Met Asp Ile Pro Val Met Glu Gly Lys Lys GlnPro Pro Arg Ala 260 265 270 His Asn Leu Lys Pro Arg Asp Leu Asn Val LeuThr Pro Thr Gly 275 280 285 Phe 7 205 PRT Homo sapiens misc_featureIncyte ID No 7478662CD1 7 Met Pro Asp Thr Ala Phe Pro Pro Ser Ser ArgAla Gly Gln Thr 1 5 10 15 Gly Pro Val Val Ser Gly Ala Gln Val Ser SerTrp Arg Glu Arg 20 25 30 Gln Pro Cys Ser Gly Ser Arg Gly Pro Ser His IleLeu Gly Thr 35 40 45 Asp Ala Asn Ala Gly Thr Ala Gly Lys Leu Gly Leu ValSer Phe 50 55 60 Pro Gly Phe Arg Lys Lys Pro Thr Arg Ser Val Leu Gln AsnGlu 65 70 75 Thr Phe Val Ser Arg Ser Thr Ala Asp Cys Lys Trp Leu Pro Phe80 85 90 Val Ala Val Ser Ile Leu Ser Ile Leu Arg Leu Asp Trp Phe Leu 95100 105 Leu Leu Ala Leu Pro Cys Pro Thr Asp His Lys Gly Glu Gln Gln 110115 120 
Glu Ala Pro Ser Lys His Pro Gln Met Ala Leu Asp Ile Ser His 125130 135 Ile Leu Arg Asn Met Ser Cys Ser Gly Arg Ala Lys Ala Ser Ser 140145 150 Lys Ala Cys Gly Ala Gly Gly Ser Gln Ala Arg Trp Gly Asn Pro 155160 165 Glu Pro Trp Gly Leu Pro Met Gly Ile Gly Arg Ser Gln Gly Arg 170175 180 Val Arg Gly Ser Thr Gly Gly Val Arg Glu Ser Pro Arg Ala Leu 185190 195 Leu Ala Gln Gly Ala Val Asn Thr Met Asp 200 205 8 233 PRT Homosapiens misc_feature Incyte ID No 7650474CD1 8 Met Ser Lys Leu Gly LysPhe Phe Lys Gly Gly Gly Ser Ser Lys 1 5 10 15 Ser Arg Ala Ala Pro SerPro Gln Glu Ala Leu Val Arg Leu Arg 20 25 30 Glu Thr Glu Glu Met Leu GlyLys Lys Gln Glu Tyr Leu Glu Asn 35 40 45 Arg Ile Gln Arg Glu Ile Ala LeuAla Lys Lys His Gly Thr Gln 50 55 60 Asn Lys Arg Ala Ala Leu Gln Ala LeuLys Arg Lys Lys Arg Phe 65 70 75 Glu Lys Gln Leu Thr Gln Ile Asp Gly ThrLeu Ser Thr Ile Glu 80 85 90 Phe Gln Arg Glu Ala Leu Glu Asn Ser His ThrAsn Thr Glu Val 95 100 105 Leu Arg Asn Met Gly Phe Ala Ala Lys Ala MetLys Ser Val His 110 115 120 Glu Asn Met Asp Leu Asn Lys Ile Asp Asp LeuMet Gln Glu Ile 125 130 135 Thr Glu Gln Gln Asp Ile Ala Gln Glu Ile SerGlu Ala Phe Ser 140 145 150 Gln Arg Val Gly Phe Gly Asp Asp Phe Asp GluAsp Glu Leu Met 155 160 165 Ala Glu Leu Glu Glu Leu Glu Gln Glu Glu LeuAsn Lys Lys Met 170 175 180 Thr Asn Ile Arg Leu Pro Asn Val Pro Ser SerSer Leu Pro Ala 185 190 195 Gln Pro Asn Arg Lys Pro Gly Met Ser Ser ThrAla Arg Arg Ser 200 205 210 Arg Ala Ala Ser Ser Gln Arg Ala Glu Glu GluAsp Asp Asp Ile 215 220 225 Lys Gln Leu Ala Ala Trp Ala Thr 230 9 205PRT Homo sapiens misc_feature Incyte ID No 8092378CD1 9 Met Ile Val ValLeu His Val His Phe His Met Ala Met Leu Pro 1 5 10 15 Phe Pro Ile PheLeu Val Leu Leu Leu Arg Gly Leu Val Leu Trp 20 25 30 Thr Pro Ala Ser SerGly Thr Ile Met Pro Glu Glu Arg Lys Thr 35 40 45 Glu Ile Glu Arg Glu ThrGlu Thr Glu Ser Glu Thr Val Ile Gly 50 55 60 Thr Glu Lys Glu Asn Ala ProGlu Arg Glu Arg Gly Ser Val Ile 65 70 75 Thr Val Leu His Gln Val Phe SerThr Ala Met Lys Asn 
Asp Thr 80 85 90 Asp Thr Gly Asn Met Gln Lys Glu ValMet Ser Val Thr Glu Gln 95 100 105 Val Glu Lys Lys Lys Asn Asp Ile GluLys Asp Asp Thr Gly Arg 110 115 120 Lys Arg Lys Pro Asp Ile Ser Leu LeuGlu Val Ile Val Asp Val 125 130 135 Ala Met Lys Val Lys Lys Glu Ile ValThr Gly Asp Thr Asn Thr 140 145 150 Lys Asn Leu Lys Glu Ala Lys Lys GluLys Lys Arg Ala Val Ser 155 160 165 Leu Pro Leu Asn Arg Arg Ala Pro LysLeu His Leu Gln Asn Arg 170 175 180 His Gly Phe Gly Leu Leu Cys Ile LeuVal Pro Glu Val Asp Thr 185 190 195 Ile Asn Leu Val Ile Phe Leu Asp AsnVal 200 205 10 304 PRT Homo sapiens misc_feature Incyte ID No 1263178CD110 Met Gly Gln Cys Val Thr Lys Cys Lys Asn Pro Ser Ser Thr Leu 1 5 10 15Gly Ser Lys Asn Gly Asp Arg Glu Pro Ser Asn Lys Ser His Ser 20 25 30 ArgArg Gly Ala Gly His Arg Glu Glu Gln Val Pro Pro Cys Gly 35 40 45 Lys ProGly Gly Asp Ile Leu Val Asn Gly Thr Lys Lys Ala Glu 50 55 60 Ala Ala ThrGlu Ala Cys Gln Leu Pro Thr Ser Ser Gly Asp Ala 65 70 75 Gly Arg Glu SerLys Ser Asn Ala Glu Glu Ser Ser Leu Gln Arg 80 85 90 Leu Glu Glu Leu PheArg Arg Tyr Lys Asp Glu Arg Glu Asp Ala 95 100 105 Ile Leu Glu Glu GlyMet Glu Arg Phe Cys Asn Asp Leu Cys Val 110 115 120 Asp Pro Thr Glu PheArg Val Leu Leu Leu Ala Trp Lys Phe Gln 125 130 135 Ala Ala Thr Met CysLys Phe Thr Arg Lys Glu Phe Phe Asp Gly 140 145 150 Cys Lys Ala Ile SerAla Asp Ser Ile Asp Gly Ile Cys Ala Arg 155 160 165 Phe Pro Ser Leu LeuThr Glu Ala Lys Gln Glu Asp Lys Phe Lys 170 175 180 Asp Leu Tyr Arg PheThr Phe Gln Phe Gly Leu Asp Ser Glu Glu 185 190 195 Gly Gln Arg Ser LeuHis Arg Glu Ile Ala Ile Ala Leu Trp Lys 200 205 210 Leu Val Phe Thr GlnAsn Asn Pro Pro Val Leu Asp Gln Trp Leu 215 220 225 Asn Phe Leu Thr GluAsn Pro Ser Gly Ile Lys Gly Ile Ser Arg 230 235 240 Asp Thr Trp Asn MetPhe Leu Asn Phe Thr Gln Val Ile Gly Pro 245 250 255 Asp Leu Ser Asn TyrSer Glu Asp Glu Ala Trp Pro Ser Leu Phe 260 265 270 Asp Thr Phe Val GluTrp Glu Met Glu Arg Arg Lys Arg Glu Gly 275 280 285 Glu Gly Arg Gly AlaLeu Ser Ser Gly Pro Glu 
Gly Leu Cys Pro 290 295 300 Glu Glu Gln Thr 11418 PRT Homo sapiens misc_feature Incyte ID No 3535417CD1 11 Met Pro MetLeu Leu Pro His Pro His Gln His Phe Leu Lys Gly 1 5 10 15 Leu Leu ArgAla Pro Phe Arg Cys Tyr His Phe Ile Phe His Ser 20 25 30 Ser Thr His LeuGly Ser Gly Ile Pro Cys Ala Gln Pro Phe Asn 35 40 45 Ser Leu Gly Leu HisCys Thr Lys Trp Met Leu Leu Ser Asp Gly 50 55 60 Leu Lys Arg Lys Leu CysVal Gln Thr Thr Leu Lys Asp His Thr 65 70 75 Glu Gly Leu Ser Asp Lys GluGln Arg Phe Val Asp Lys Leu Tyr 80 85 90 Thr Gly Leu Ile Gln Gly Gln ArgAla Cys Leu Ala Glu Ala Ile 95 100 105 Thr Leu Val Glu Ser Thr His SerArg Lys Lys Glu Leu Ala Gln 110 115 120 Val Leu Leu Gln Lys Val Leu LeuTyr His Arg Glu Gln Glu Gln 125 130 135 Ser Asn Lys Gly Lys Pro Leu AlaPhe Arg Val Gly Leu Ser Gly 140 145 150 Pro Pro Gly Ala Gly Lys Ser ThrPhe Ile Glu Tyr Phe Gly Lys 155 160 165 Met Leu Thr Glu Arg Gly His LysLeu Ser Val Leu Ala Val Asp 170 175 180 Pro Ser Ser Cys Thr Ser Gly GlySer Leu Leu Gly Asp Lys Thr 185 190 195 Arg Met Thr Glu Leu Ser Arg AspMet Asn Ala Tyr Ile Arg Pro 200 205 210 Ser Pro Thr Arg Gly Thr Leu GlyGly Val Thr Arg Thr Thr Asn 215 220 225 Glu Ala Ile Leu Leu Cys Glu GlyAla Gly Tyr Asp Ile Ile Leu 230 235 240 Ile Glu Thr Val Gly Val Gly GlnSer Glu Phe Ala Val Ala Asp 245 250 255 Met Val Asp Met Phe Val Leu LeuLeu Pro Pro Ala Gly Gly Asp 260 265 270 Glu Leu Gln Gly Ile Lys Arg GlyIle Ile Glu Met Ala Asp Leu 275 280 285 Val Ala Val Thr Lys Ser Asp GlyAsp Leu Ile Val Pro Ala Arg 290 295 300 Arg Ile Gln Ala Glu Tyr Val SerAla Leu Lys Leu Leu Arg Lys 305 310 315 Arg Ser Gln Val Trp Lys Pro LysVal Ile Arg Ile Ser Ala Arg 320 325 330 Ser Gly Glu Gly Ile Ser Glu MetTrp Asp Lys Met Lys Asp Phe 335 340 345 Gln Asp Leu Met Leu Ala Ser GlyGlu Leu Thr Ala Lys Arg Arg 350 355 360 Lys Gln Gln Lys Val Trp Met TrpAsn Leu Ile Gln Glu Ser Val 365 370 375 Leu Glu His Phe Arg Thr His ProThr Val Arg Glu Gln Ile Pro 380 385 390 Leu Leu Glu Gln Lys Val Leu IleGly Ala Leu Ser Pro Gly Leu 395 
400 405 Ala Ala Asp Phe Leu Leu Lys AlaPhe Lys Ser Arg Asp 410 415 12 1117 PRT Homo sapiens misc_feature IncyteID No 7473555CD1 12 Met Phe Gln Thr Glu Leu Met Ala Leu Pro His Thr TyrSer Phe 1 5 10 15 Leu Tyr Ser Val Ser Asp Ile Thr Cys Pro Pro Arg GlnAla Pro 20 25 30 Gly Ser His Leu Val Thr Asn Pro Val Thr Val Val Ala ProVal 35 40 45 Ala Pro Gly Val Leu Pro Val Leu Arg Arg Gln Cys Cys Met Met50 55 60 Gly Asn Ala Cys Leu Asn Ala Leu Ala Gly Thr Met Leu Met Pro 6570 75 Leu Ala Gly Ala Lys Phe Val Ile Thr His Val Pro Ala Ala Leu 80 8590 Gly Pro His Pro Leu Thr Val Gln Pro Ala Ala Pro Pro Arg Leu 95 100105 Cys Val Lys Ala Thr Val Cys Pro Ala Val Glu Arg Val Ser Thr 110 115120 Leu Thr Thr Glu Ser Ala Lys Lys Pro Glu Glu Gly Leu Gln Val 125 130135 Glu Gln Leu Ser Gly Val Gly Ile Pro Ser Gly Glu Cys Leu Ala 140 145150 Gln Cys Arg Ala His Phe Tyr Leu Glu Ser Thr Gly Leu Cys Glu 155 160165 Glu His Pro Glu Leu Trp Lys Leu Ala Ile Ala Met Ser Glu Leu 170 175180 Arg Val Trp Glu Gly Lys Thr Ile Leu Thr Val Val Pro Thr Thr 185 190195 Ile Pro Leu Ser Gln Tyr Gln Arg Arg Pro Arg His Ser Ala Leu 200 205210 Leu Thr Ser Asn Leu Thr Val Pro Ile Gln Ser Cys Val Lys Pro 215 220225 Pro Tyr Met Leu Leu Val Gly Asn Ile Lys Ile Trp Thr Asn Asn 230 235240 Gln Ile Val Gln Cys Ile Asn Cys His Leu Tyr Thr Cys Ile Asn 245 250255 Ser His Phe Asp Ser Arg Lys Ser Val Met Leu Val Arg Ala Arg 260 265270 Glu Gly Ile Trp Ile Pro Leu Ala Thr Ser Pro Val Ser Asp Val 275 280285 Gln Gly Lys Ala His Ile Thr Ala Gln Thr Val Gly Leu Pro Met 290 295300 Cys Cys Trp Met Gly Ser Ala Ser Pro Ser Ala Gln Met Ala Thr 305 310315 Phe Thr Arg Lys Val Val Ala Gln Ser Val Thr Gln Pro Ala Gly 320 325330 Ser Val Met Gly Leu Trp Ser Leu Thr Ala Ser Pro Val Thr Leu 335 340345 Thr Ser Leu Leu Pro Met Val Thr Ala Gly Pro Ala Ala Gly Lys 350 355360 Ser Ser Ser Ser Thr Ser Trp Asp Thr Val Leu Thr Ala Ile Thr 365 370375 Cys Ala Ser Thr Val Gln Leu Ile Ser Thr Thr Leu Gly Ala Ser 380 385390 Ala Ser Gly Ala Arg Met Pro Thr Thr Cys 
Cys Ser Gly Thr Thr 395 400405 Val Phe Leu Thr Ala Leu Gln Asp Thr Met Gln Arg Glu Glu Leu 410 415420 Val Lys Asn Ala Thr Pro Pro Ala Glu Pro Ala Arg Ala Glu Asp 425 430435 Leu Ser Pro Ala Pro His Val Thr Pro Thr Ser Cys Cys Pro Thr 440 445450 Leu Ala Pro Ala Ala Pro Pro Ala Ser Leu Gly Thr Ile Leu Met 455 460465 Thr Ile Met Phe Ala Ser Thr Gln Ser Gln Trp Ser Ile Glu Val 470 475480 Gly Val Asp Asp His Phe Leu Asp Leu Gln Gln Lys Thr Ser Leu 485 490495 Phe Lys Lys Val Ile Glu Pro Pro His Ser His His Arg Cys Asp 500 505510 Gly Tyr Ser Thr Ser Gln Leu Pro Ser Val Val Ser Gly Arg Leu 515 520525 Cys His Phe Phe Thr Val Leu Leu His Leu Leu Trp Gly Ser Leu 530 535540 Thr Ser Glu Asn Cys Pro Cys Tyr Ser Arg Cys Gly His Thr Lys 545 550555 Met Ser Ala Ser Val Leu His Ala Thr His Thr Val Glu Ala Val 560 565570 Ile His Arg Pro Ala Val Pro Pro Ala Glu Ile Gln Thr Arg Phe 575 580585 Cys Ser Leu Gly Asn Val Asn Thr Arg Ala Ala Pro His Ser Thr 590 595600 Ile Leu Thr Ser Pro Pro Thr Arg Ala Lys Cys Asn Arg Arg Leu 605 610615 Lys Arg Cys Ser Cys Arg Pro Cys Trp Met Glu Gln Asp Leu Phe 620 625630 Ser Ser Ala Val Pro Thr Ile Pro Phe Leu Ser His Lys Ala Tyr 635 640645 Ser Glu Asp Thr Leu Leu Ser Trp Ala Val Gly Val Cys Ile Tyr 650 655660 Asn Leu Arg Leu Leu Pro Pro Ser Leu Ile Ser Ser Trp Asn Ser 665 670675 Arg Gly Leu His Thr Leu Asn Gly Ser Ala Gln Gln Gln Glu Ser 680 685690 Lys Ala Ala Asp Arg Val Leu Ile Asn Glu Leu Leu Gly Leu Arg 695 700705 Val Asp Arg Lys Glu Asp Asn Leu Met Gln Thr Phe Phe Leu Glu 710 715720 Cys Asp Trp Ser Cys Ser Ala Cys Ser Gly Pro Leu Lys Thr Asp 725 730735 Cys Leu Gln Cys Met Asp Gly Tyr Val Leu Gln Asp Gly Ala Cys 740 745750 Val Glu Gln Cys Leu Ser Ser Phe Tyr Gln Asp Ser Gly Leu Cys 755 760765 Lys Asn Cys Asp Ser Tyr Cys Leu Gln Cys Gln Gly Pro His Glu 770 775780 Cys Thr Arg Cys Lys Gly Pro Phe Leu Leu Leu Glu Ala Gln Cys 785 790795 Val Gln Glu Cys Gly Lys Gly Tyr Phe Ala Asp His Ala Lys His 800 805810 Lys Cys Thr Ala Cys Pro Gln Gly Cys Leu Gln Cys Ser 
His Arg 815 820825 Asp Arg Cys His Leu Cys Asp His Gly Phe Phe Leu Lys Ser Gly 830 835840 Leu Cys Val Tyr Asn Cys Val Pro Gly Phe Ser Val His Thr Ser 845 850855 Asn Glu Thr Cys Ser Gly Lys Ile His Thr Pro Ser Leu His Val 860 865870 Asn Gly Ser Leu Ile Leu Pro Ile Gly Ser Ile Lys Pro Leu Asp 875 880885 Phe Ser Leu Leu Asn Val Gln Asp Gln Glu Gly Arg Val Lys Asp 890 895900 Leu Leu Phe His Val Val Ser Thr Pro Thr Asn Gly Gln Leu Val 905 910915 Leu Ser Arg Asn Gly Lys Glu Val Gln Leu Asp Lys Ala Gly Arg 920 925930 Phe Ser Trp Lys Asp Val Asn Glu Lys Lys Val Arg Phe Val His 935 940945 Ser Lys Glu Lys Leu Arg Lys Gly Tyr Leu Phe Leu Lys Ile Ser 950 955960 Asp Gln Gln Phe Phe Ser Glu Pro Gln Leu Ile Asn Ile Gln Ala 965 970975 Phe Ser Thr Gln Ala Pro Tyr Val Leu Arg Asn Glu Val Leu His 980 985990 Ile Ser Arg Gly Glu Arg Ala Thr Ile Thr Thr Gln Met Leu Asp 995 10001005 Ile Arg Asp Asp Asp Asn Pro Gln Asp Val Val Ile Glu Ile Ile 10101015 1020 Asp Pro Pro Leu His Gly Gln Leu Leu Gln Thr Leu Gln Ser Pro1025 1030 1035 Ala Thr Pro Ile Tyr Gln Phe Gln Leu Asp Glu Leu Ser ArgGly 1040 1045 1050 Leu Leu His Tyr Ala His Asp Gly Ser Asp Ser Thr SerAsp Val 1055 1060 1065 Ala Val Leu Gln Ala Asn Asp Gly His Ser Phe HisAsn Ile Leu 1070 1075 1080 Phe Gln Val Lys Thr Val Pro Gln Gly Gln SerTyr Asp Gly Met 1085 1090 1095 Ser Ser Leu Asp Pro Ala Pro Ser Ser ArgVal Leu Arg Ala Phe 1100 1105 1110 Ile Phe Leu Met Leu Val Pro 1115 13920 PRT Homo sapiens misc_feature Incyte ID No 7474985CD1 13 Met Ala AspSer Ser Ser Ser Ser Phe Phe Pro Asp Phe Gly Leu 1 5 10 15 Leu Leu TyrLeu Glu Glu Leu Asn Lys Glu Glu Leu Asn Thr Phe 20 25 30 Lys Leu Phe LeuLys Glu Thr Met Glu Pro Glu His Gly Leu Thr 35 40 45 Pro Trp Asn Glu ValLys Lys Ala Arg Arg Glu Asp Leu Ala Asn 50 55 60 Leu Met Lys Lys Tyr TyrPro Gly Glu Lys Ala Trp Ser Val Ser 65 70 75 Leu Lys Ile Phe Gly Lys MetAsn Leu Lys Asp Leu Cys Glu Arg 80 85 90 Ala Lys Glu Glu Ile Asn Trp SerAla Gln Thr Ile Gly Pro Asp 95 100 105 Asp Ala Lys Ala Gly Glu Thr 
GlnGlu Asp Gln Glu Ala Val Leu 110 115 120 Val Ile Val Asn Thr Gly Val ProAsn Ser Trp Ala Thr Asp Pro 125 130 135 Tyr Trp Ser Ala Ala Pro Arg GluSer Gly Arg Ile Ala Gly Gly 140 145 150 Asp Gly Thr Glu Tyr Arg Asn ArgIle Lys Glu Lys Phe Cys Ile 155 160 165 Thr Trp Asp Lys Lys Ser Leu AlaGly Lys Pro Glu Asp Phe His 170 175 180 His Gly Ile Ala Glu Lys Asp ArgLys Leu Leu Glu His Leu Phe 185 190 195 Asp Val Asp Val Lys Thr Gly AlaGln Pro Gln Ile Val Val Leu 200 205 210 Gln Gly Ala Ala Gly Val Gly LysThr Thr Leu Val Arg Lys Ala 215 220 225 Met Leu Asp Trp Ala Glu Gly SerLeu Tyr Gln Gln Arg Phe Lys 230 235 240 Tyr Val Phe Tyr Leu Asn Gly ArgGlu Ile Asn Gln Leu Lys Glu 245 250 255 Arg Ser Phe Ala Gln Leu Ile SerLys Asp Trp Pro Ser Thr Glu 260 265 270 Gly Pro Ile Glu Glu Ile Met TyrGln Pro Ser Ser Leu Leu Phe 275 280 285 Ile Ile Asp Ser Phe Asp Glu LeuAsn Phe Ala Phe Glu Glu Pro 290 295 300 Glu Phe Ala Leu Cys Glu Asp TrpThr Gln Glu His Pro Val Ser 305 310 315 Phe Leu Met Ser Ser Leu Leu ArgLys Val Met Leu Pro Glu Ala 320 325 330 Ser Leu Leu Val Thr Thr Arg LeuThr Thr Ser Lys Arg Leu Lys 335 340 345 Gln Leu Leu Lys Asn His His TyrVal Glu Leu Leu Gly Met Ser 350 355 360 Glu Asp Ala Arg Glu Glu Tyr IleTyr Gln Phe Phe Glu Asp Lys 365 370 375 Arg Trp Ala Met Lys Val Phe SerSer Leu Lys Ser Asn Glu Met 380 385 390 Leu Phe Ser Met Cys Gln Val ProLeu Val Cys Trp Ala Ala Cys 395 400 405 Thr Cys Leu Lys Gln Gln Met GluLys Gly Gly Asp Val Thr Leu 410 415 420 Thr Cys Gln Thr Thr Thr Ala LeuPhe Thr Cys Tyr Ile Ser Ser 425 430 435 Leu Phe Thr Pro Val Asp Gly GlySer Pro Ser Leu Pro Asn Gln 440 445 450 Ala Gln Leu Arg Arg Leu Cys GlnVal Ala Ala Lys Gly Ile Trp 455 460 465 Thr Met Thr Tyr Val Phe Tyr ArgGlu Asn Leu Arg Arg Leu Gly 470 475 480 Leu Thr Gln Ser Asp Val Ser SerPhe Met Asp Ser Asn Ile Ile 485 490 495 Gln Lys Asp Ala Glu Tyr Glu AsnCys Tyr Val Phe Thr His Leu 500 505 510 His Val Gln Glu Phe Phe Ala AlaMet Phe Tyr Met Leu Lys Gly 515 520 525 Ser Trp Glu Ala Gly Asn Pro SerCys Gln 
Pro Phe Glu Asp Leu 530 535 540 Lys Ser Leu Leu Gln Ser Thr Ser Tyr Lys Asp Pro His Leu Thr 545 550 555 Gln Met Lys Cys Phe Leu Phe Gly Leu Leu Asn Glu Asp Arg Val 560 565 570 Lys Gln Leu Glu Arg Thr Phe Asn Cys Lys Met Ser Leu Lys Ile 575 580 585 Lys Ser Lys Leu Leu Gln Cys Met Glu Val Leu Gly Asn Ser Asp 590 595 600 Tyr Ser Pro Ser Gln Leu Gly Phe Leu Glu Leu Phe His Cys Leu 605 610 615 Tyr Glu Thr Gln Asp Lys Ala Phe Ile Ser Gln Ala Met Arg Cys 620 625 630 Phe Pro Lys Val Ala Ile Asn Ile Cys Glu Lys Ile His Leu Leu 635 640 645 Val Ser Ser Phe Cys Leu Lys His Cys Arg Cys Leu Arg Thr Ile 650 655 660 Arg Leu Ser Val Thr Val Val Phe Glu Lys Lys Ile Leu Lys Thr 665 670 675 Ser Leu Pro Thr Asn Thr Trp Leu Glu Tyr Cys Gly Leu Thr Ser 680 685 690 Leu Cys Cys Gln Asp Leu Ser Ser Ala Leu Ile Cys Asn Lys Arg 695 700 705 Leu Ile Lys Met Asn Leu Thr Gln Asn Thr Leu Gly Tyr Glu Gly 710 715 720 Ile Val Lys Leu Tyr Lys Val Leu Lys Ser Pro Lys Cys Lys Leu 725 730 735 Gln Val Leu Gly Leu Glu Ser Cys Gly Leu Thr Glu Ala Gly Cys 740 745 750 Glu Tyr Leu Ser Leu Ala Leu Ile Ser Asn Lys Arg Leu Thr His 755 760 765 Leu Cys Leu Ala Asp Asn Val Leu Gly Asp Gly Gly Val Lys Leu 770 775 780 Met Ser Asp Ala Leu Gln His Ala Gln Cys Thr Leu Lys Ser Leu 785 790 795 Val Leu Met Gly Cys Val Leu Thr Asn Ala Cys Cys Leu Asp Leu 800 805 810 Ala Ser Val Ile Leu Asn Asn Pro Asn Leu Arg Ser Leu Asp Leu 815 820 825 Gly Asn Asn Asp Leu Gln Asp Asp Gly Val Lys Ile Leu Cys Asp 830 835 840 Ala Leu Arg Tyr Pro Asn Cys Asn Ile Gln Arg Leu Gly Leu Leu 845 850 855 Ile Thr Asn Thr Ile Leu Glu Asn Gly Pro Glu Leu Gln Leu Ser 860 865 870 Asp Gln Pro Arg Arg Leu Arg Lys Lys Ser Met Glu Met Val Leu 875 880 885 Phe Gln Gly Ala Cys Trp Leu Pro His Thr Ile Asp Leu Ser Gly 890 895 900 Asp Ser Ser Leu Ala Asp Glu Ser Glu Arg Ile Ser Ser His Thr 905 910 915 Val Leu His Ile Val 920
14 340 PRT Homo sapiens misc_feature Incyte ID No 7475137CD1
14 Met Gly Arg Asn Gln Ser Gly Lys Ala Glu Asn Ser Lys Asn Gln 1 5 10 15 Ser Ala Ser Ser Pro Pro Lys Asp Cys
Asn Ser Pro Pro Ala Met 20 25 30 Glu Gln Ser Trp Ile Glu Asn Asp Phe Asp Glu Leu Thr Glu Val 35 40 45 Gly Phe Arg Arg Ser Val Ile Thr Asn Tyr Ser Ser Glu Leu Lys 50 55 60 Glu His Val Gln Thr His Cys Lys Glu Ala Lys Asn Leu Glu Lys 65 70 75 Trp Leu Asp Glu Trp Leu Asn Arg Ile Asn Ser Val Glu Lys Thr 80 85 90 Leu Asn Asp Leu Lys Glu Leu Lys Thr Met Ala Gln Glu Leu Arg 95 100 105 Glu Ala Cys Thr Ser Phe Asn Ser Gln Phe Asn Gln Val Glu Glu 110 115 120 Arg Val Ser Val Ile Glu Asp Gln Ile Asn Glu Ile Lys Arg Glu 125 130 135 Asp Met Val Arg Glu Lys Arg Leu Lys Arg Asn Lys Gln Ser Leu 140 145 150 Gln Glu Ile Trp Tyr Tyr Val Lys Arg Pro Asn Leu His Leu Ile 155 160 165 Ser Val Pro Glu Thr Asp Gly Ala Asn Arg Thr Lys Leu Glu His 170 175 180 Thr Leu His Asp Ile Ile Gln Asn Leu Pro Lys Ile Ala Arg Gln 185 190 195 Ala Asn Thr Gln Ile Gln Glu Ile Gln Arg Thr Pro Gln Arg Tyr 200 205 210 Ser Ser Arg Ile Ala Thr Pro Arg His Val Val Arg Phe Thr Lys 215 220 225 Val Lys Met Lys Glu Lys Met Leu Thr Ala Ala Arg Glu Lys Gly 230 235 240 Gln Val Thr His Lys Gly Lys Pro Ile Arg Leu Thr Ala Asp Leu 245 250 255 Leu Ala Glu Thr Leu Gln Ala Arg Arg Gln Trp Gly Pro Ile Phe 260 265 270 Asn Ile Leu Lys Glu Lys Asn Phe Gln Ser Arg Ile Pro Tyr Pro 275 280 285 Ala Lys Leu Ser Phe Ile Ser Glu Gly Glu Ile Lys Ser Phe Thr 290 295 300 Glu Lys Gln Met Leu Arg Asp Phe Val Thr Thr Arg Pro Ala Leu 305 310 315 Gln Glu Leu Leu Lys Glu Ala Leu Asn Met Glu Arg Asn Asn Gln 320 325 330 Tyr Gln Pro Leu Gln Lys His Ala Lys Leu 335 340
15 577 PRT Homo sapiens misc_feature Incyte ID No 5036986CD1
15 Met His Trp Leu Arg Lys Val Gln Gly Leu Cys Thr Leu Trp Gly 1 5 10 15 Thr Gln Met Ser Ser Arg Thr Leu Tyr Ile Asn Ser Arg Gln Leu 20 25 30 Val Ser Leu Gln Trp Gly His Gln Glu Val Pro Ala Lys Phe Asn 35 40 45 Phe Ala Ser Asp Val Leu Asp His Trp Ala Asp Met Glu Lys Ala 50 55 60 Gly Lys Arg Leu Pro Ser Pro Ala Leu Trp Trp Val Asn Gly Lys 65 70 75 Gly Lys Glu Leu Met Trp Asn Phe Arg Glu Leu Ser Glu Asn Ser 80 85 90 Gln Gln Ala Ala Asn Val Leu Ser Gly Ala
Cys Gly Leu Gln Arg 95 100 105 Gly Asp Arg Val Ala Val Met Leu Pro Arg Val Pro Glu Trp Trp 110 115 120 Leu Val Ile Leu Gly Cys Ile Arg Ala Gly Leu Ile Phe Met Pro 125 130 135 Gly Thr Ile Gln Met Lys Ser Thr Asp Ile Leu Tyr Arg Leu Gln 140 145 150 Met Ser Lys Ala Lys Ala Ile Val Ala Gly Asp Glu Val Ile Gln 155 160 165 Glu Val Asp Thr Val Ala Ser Glu Cys Pro Ser Leu Arg Ile Lys 170 175 180 Leu Leu Val Ser Glu Lys Ser Cys Asp Gly Trp Leu Asn Phe Lys 185 190 195 Lys Leu Leu Asn Glu Ala Ser Thr Thr His His Cys Val Glu Thr 200 205 210 Gly Ser Gln Glu Ala Ser Ala Ile Tyr Phe Thr Ser Gly Thr Ser 215 220 225 Gly Leu Pro Lys Met Ala Glu His Ser Tyr Ser Ser Leu Gly Leu 230 235 240 Lys Ala Lys Met Asp Ala Gly Trp Thr Gly Leu Gln Ala Ser Asp 245 250 255 Ile Met Trp Thr Ile Ser Asp Thr Gly Trp Ile Leu Asn Ile Leu 260 265 270 Gly Ser Leu Leu Glu Ser Trp Thr Leu Gly Ala Cys Thr Phe Val 275 280 285 His Leu Leu Pro Lys Phe Asp Pro Leu Val Ile Leu Lys Thr Leu 290 295 300 Ser Ser Tyr Pro Ile Lys Ser Met Met Gly Ala Pro Ile Val Tyr 305 310 315 Arg Met Leu Leu Gln Gln Asp Leu Ser Ser Tyr Lys Phe Pro His 320 325 330 Leu Gln Asn Cys Leu Ala Gly Gly Glu Ser Leu Leu Pro Glu Thr 335 340 345 Leu Glu Asn Trp Arg Ala Gln Thr Gly Leu Asp Ile Arg Glu Phe 350 355 360 Tyr Gly Gln Thr Glu Thr Gly Leu Thr Cys Met Val Ser Lys Thr 365 370 375 Met Lys Ile Lys Pro Gly Tyr Met Gly Thr Ala Ala Ser Cys Tyr 380 385 390 Asp Val Gln Val Ile Asp Asp Lys Gly Asn Val Leu Pro Pro Gly 395 400 405 Thr Glu Gly Asp Ile Gly Ile Arg Val Lys Pro Ile Arg Pro Ile 410 415 420 Gly Ile Phe Ser Gly Tyr Val Glu Asn Pro Asp Lys Thr Ala Ala 425 430 435 Asn Ile Arg Gly Asp Phe Trp Leu Leu Gly Asp Arg Gly Ile Lys 440 445 450 Asp Glu Asp Gly Tyr Phe Gln Phe Met Gly Arg Ala Asp Asp Ile 455 460 465 Ile Asn Ser Ser Gly Tyr Arg Ile Gly Pro Ser Glu Val Glu Asn 470 475 480 Ala Leu Met Lys His Pro Ala Val Val Glu Thr Ala Val Ile Ser 485 490 495 Ser Pro Asp Pro Val Arg Gly Glu Val Val Lys Ala Phe Val Ile 500 505 510 Leu Ala Ser Gln Phe Leu Ser His Asp Pro Glu Gln Leu
Thr Lys 515 520 525 Glu Leu Gln Gln His Val Lys Ser Val Thr Ala Pro Tyr Lys Tyr 530 535 540 Pro Arg Lys Ile Glu Phe Val Leu Asn Leu Pro Lys Thr Val Thr 545 550 555 Gly Lys Ile Gln Arg Thr Lys Leu Arg Asp Lys Glu Trp Lys Met 560 565 570 Ser Gly Lys Ala Arg Ala Gln 575
16 119 PRT Homo sapiens misc_feature Incyte ID No 1375644CD1
16 Met Val Arg Ile Ser Lys Pro Lys Thr Phe Gln Ala Tyr Leu Asp 1 5 10 15 Asp Cys His Arg Arg Tyr Ser Cys Ala His Cys Arg Ala His Leu 20 25 30 Ala Asn His Asp Asp Leu Ile Ser Lys Ser Phe Gln Gly Ser Gln 35 40 45 Gly Arg Ala Tyr Leu Phe Asn Ser Val Val Asn Val Gly Cys Gly 50 55 60 Pro Ala Glu Glu Arg Val Leu Leu Thr Gly Leu His Ala Val Ala 65 70 75 Asp Ile His Cys Glu Asn Cys Lys Thr Thr Leu Gly Trp Lys Tyr 80 85 90 Glu Gln Ala Phe Glu Ser Ser Gln Lys Tyr Lys Glu Gly Lys Tyr 95 100 105 Ile Ile Glu Leu Asn His Met Ile Lys Asp Asn Gly Trp Asp 110 115
17 389 PRT Homo sapiens misc_feature Incyte ID No 1356261CD1
17 Met Ser Val Arg Thr Leu Pro Leu Leu Phe Leu Asn Leu Gly Gly 1 5 10 15 Glu Met Leu Tyr Ile Leu Asp Gln Arg Leu Arg Ala Gln Asn Ile 20 25 30 Pro Gly Asp Lys Ala Arg Lys Asp Glu Trp Thr Glu Val Asp Arg 35 40 45 Lys Arg Val Leu Asn Asp Ile Ile Ser Thr Met Phe Asn Arg Lys 50 55 60 Phe Met Glu Glu Leu Phe Lys Pro Gln Glu Leu Tyr Ser Lys Lys 65 70 75 Ala Leu Arg Thr Val Tyr Glu Arg Leu Ala His Ala Ser Ile Met 80 85 90 Lys Leu Asn Gln Ala Ser Met Asp Lys Leu Tyr Asp Leu Met Thr 95 100 105 Met Ala Phe Lys Tyr Gln Val Leu Leu Cys Pro Arg Pro Lys Asp 110 115 120 Val Leu Leu Val Thr Phe Asn His Leu Asp Thr Ile Lys Gly Phe 125 130 135 Ile Arg Asp Ser Pro Ala Ile Leu Gln Gln Val Asp Glu Thr Leu 140 145 150 Arg Gln Leu Thr Glu Ile Tyr Gly Gly Leu Ser Ala Gly Glu Phe 155 160 165 Gln Leu Ile Arg Gln Thr Leu Leu Ile Phe Phe Gln Asp Leu His 170 175 180 Ile Arg Val Ser Met Phe Leu Lys Asp Lys Val Gln Asn Asn Asn 185 190 195 Gly Arg Phe Val Leu Pro Val Ser Gly Pro Val Pro Trp Gly Thr 200 205 210 Glu Val Pro Gly Leu Ile Arg Met Phe Asn Asn Lys Gly Glu Glu 215 220 225 Val Lys Arg Ile Glu
Phe Lys His Gly Gly Asn Tyr Val Pro Ala 230 235 240 Pro Lys Glu Gly Ser Phe Glu Leu Tyr Gly Asp Arg Val Leu Lys 245 250 255 Leu Gly Thr Asn Met Tyr Ser Val Asn Gln Pro Val Glu Thr His 260 265 270 Val Ser Gly Ser Ser Lys Asn Leu Ala Ser Trp Thr Gln Glu Ser 275 280 285 Ile Ala Pro Asn Pro Leu Ala Lys Glu Glu Leu Asn Phe Leu Ala 290 295 300 Arg Leu Met Gly Gly Met Glu Ile Lys Lys Pro Ser Gly Pro Glu 305 310 315 Pro Arg Phe Arg Leu Asn Leu Phe Thr Thr Asp Glu Glu Glu Glu 320 325 330 Gln Ala Ala Leu Thr Arg Pro Glu Glu Leu Ser Tyr Glu Val Ile 335 340 345 Asn Ile Gln Ala Thr Gln Asp Gln Gln Arg Ser Glu Glu Leu Ala 350 355 360 Arg Ile Met Gly Glu Phe Glu Ile Thr Glu Gln Pro Arg Leu Ser 365 370 375 Thr Ser Lys Gly Asp Asp Leu Leu Ala Met Met Asp Glu Leu 380 385
18 266 PRT Homo sapiens misc_feature Incyte ID No 1419379CD1
18 Met Val Thr Glu Gln Glu Val Asp Ala Ile Gly Gln Thr Leu Val 1 5 10 15 Asp Pro Lys Gln Pro Leu Gln Ala Arg Phe Arg Ala Leu Phe Thr 20 25 30 Leu Arg Gly Leu Gly Gly Pro Gly Ala Ile Ala Trp Ile Ser Gln 35 40 45 Ala Phe Asp Asp Asp Ser Ala Leu Leu Lys His Glu Leu Ala Tyr 50 55 60 Cys Leu Gly Gln Met Gln Asp Ala Arg Ala Ile Pro Met Leu Val 65 70 75 Asp Val Leu Gln Asp Thr Arg Gln Glu Pro Met Val Arg His Glu 80 85 90 Ala Gly Glu Ala Leu Gly Ala Ile Gly Asp Pro Glu Val Leu Glu 95 100 105 Ile Leu Lys Gln Tyr Ser Ser Asp Pro Val Ile Glu Val Ala Glu 110 115 120 Thr Cys Gln Leu Ala Val Arg Arg Leu Glu Trp Leu Gln Gln His 125 130 135 Gly Gly Glu Pro Ala Ala Gly Pro Tyr Leu Ser Val Asp Pro Ala 140 145 150 Pro Pro Ala Glu Glu Arg Asp Val Gly Arg Leu Arg Glu Ala Leu 155 160 165 Leu Asp Glu Ser Arg Pro Leu Phe Glu Arg Tyr Arg Ala Met Phe 170 175 180 Ala Leu Arg Asn Ala Gly Gly Glu Glu Ala Ala Leu Ala Leu Ala 185 190 195 Glu Gly Leu His Cys Gly Ser Ala Leu Phe Arg His Glu Val Gly 200 205 210 Tyr Val Leu Gly Gln Leu Gln His Glu Ala Glu Leu Gly Leu Trp 215 220 225 Glu Glu Ala Asp Val Met Val Ala Ser Ala Ser Leu Gly Met Gly 230 235 240 His Arg Val Trp Gly Pro Leu Thr Pro Pro Pro Arg Gly Leu Gly 245
250 255 Arg Asp Arg Val Val Val Pro Glu Trp Val Arg 260 265
19 661 PRT Homo sapiens misc_feature Incyte ID No 3509272CD1
19 Met His Pro Ile Pro Ser Ser Phe Met Ile Lys Ala Val Ser Ser 1 5 10 15 Phe Leu Thr Ala Glu Glu Ala Ser Val Gly Asn Pro Glu Gly Ala 20 25 30 Phe Met Lys Val Leu Gln Ala Arg Lys Asn Tyr Thr Ser Thr Glu 35 40 45 Leu Ile Val Glu Pro Glu Glu Pro Ser Asp Ser Ser Gly Ile Asn 50 55 60 Leu Ser Gly Phe Gly Ser Glu Gln Leu Asp Thr Asn Asp Glu Ser 65 70 75 Asp Phe Ile Ser Thr Leu Ser Tyr Ile Leu Pro Tyr Phe Ser Ala 80 85 90 Val Asn Leu Asp Val Lys Ser Leu Leu Leu Pro Leu Ile Lys Leu 95 100 105 Pro Thr Thr Gly Asn Ser Leu Ala Lys Ile Gln Thr Val Gly Gln 110 115 120 Asn Arg Gln Arg Val Lys Arg Val Leu Met Gly Pro Arg Ser Ile 125 130 135 Gln Lys Arg His Phe Lys Glu Val Gly Arg Gln Ser Ile Arg Arg 140 145 150 Glu Gln Gly Ala Gln Ala Ser Val Glu Asn Ala Ala Glu Glu Lys 155 160 165 Arg Leu Gly Ser Pro Ala Pro Arg Glu Val Glu Gln Pro His Thr 170 175 180 Gln Gln Gly Pro Glu Lys Leu Ala Gly Asn Ala Val Tyr Thr Lys 185 190 195 Pro Ser Phe Thr Gln Glu His Lys Ala Ala Val Ser Val Leu Lys 200 205 210 Pro Phe Ser Lys Gly Ala Pro Ser Thr Ser Ser Pro Ala Lys Ala 215 220 225 Leu Pro Gln Val Arg Asp Arg Trp Lys Asp Leu Thr His Ala Ile 230 235 240 Ser Ile Leu Glu Ser Ala Lys Ala Arg Val Thr Asn Thr Lys Thr 245 250 255 Ser Lys Pro Ile Val His Ala Arg Lys Lys Tyr Arg Phe His Lys 260 265 270 Thr Arg Ser His Val Thr His Arg Thr Pro Lys Val Lys Lys Ser 275 280 285 Pro Lys Val Arg Lys Lys Ser Tyr Leu Ser Arg Leu Met Leu Ala 290 295 300 Asn Arg Leu Pro Phe Ser Ala Ala Lys Ser Leu Ile Asn Ser Pro 305 310 315 Ser Gln Gly Ala Phe Ser Ser Leu Gly Asp Leu Ser Pro Gln Glu 320 325 330 Asn Pro Phe Leu Glu Val Ser Ala Pro Ser Glu His Phe Ile Glu 335 340 345 Lys Asn Asn Thr Lys His Thr Thr Ala Arg Asn Ala Phe Glu Glu 350 355 360 Asn Asp Phe Met Glu Asn Thr Asn Met Pro Glu Gly Thr Ile Ser 365 370 375 Glu Asn Thr Asn Tyr Asn His Pro Pro Glu Ala Asp Ser Ala Gly 380 385 390 Thr Ala Phe Asn Leu Gly Pro Thr Val Lys Gln Thr
Glu Thr Lys 395 400 405 Trp Glu Tyr Asn Asn Val Gly Thr Asp Leu Ser Pro Glu Pro Lys 410 415 420 Ser Phe Asn Tyr Pro Leu Leu Ser Ser Pro Gly Asp Gln Phe Glu 425 430 435 Ile Gln Leu Thr Gln Gln Leu Gln Ser Leu Ile Pro Asn Asn Asn 440 445 450 Val Arg Arg Leu Ile Ala His Val Ile Arg Thr Leu Lys Met Asp 455 460 465 Cys Ser Gly Ala His Val Gln Val Thr Cys Ala Lys Leu Ile Ser 470 475 480 Arg Thr Gly His Leu Met Lys Leu Leu Ser Gly Gln Gln Glu Val 485 490 495 Lys Ala Ser Lys Ile Glu Trp Asp Thr Asp Gln Trp Lys Ile Glu 500 505 510 Asn Tyr Ile Asn Glu Ser Thr Glu Ala Gln Ser Glu Gln Lys Glu 515 520 525 Lys Ser Leu Glu Leu Lys Lys Glu Val Pro Gly Tyr Gly Tyr Thr 530 535 540 Asp Lys Leu Ile Leu Ala Leu Ile Val Thr Gly Ile Leu Thr Ile 545 550 555 Leu Ile Ile Leu Phe Cys Leu Ile Val Ile Cys Cys His Arg Arg 560 565 570 Ser Leu Gln Glu Asp Glu Glu Gly Phe Ser Arg Gly Ile Phe Arg 575 580 585 Phe Leu Pro Arg Arg Gly Cys Ser Ser Arg Arg Glu Ser Gln Asp 590 595 600 Gly Leu Ser Ser Phe Gly Gln Pro Leu Trp Phe Lys Asp Met Tyr 605 610 615 Lys Pro Leu Ser Ala Thr Arg Ile Asn Asn His Ala Trp Lys Leu 620 625 630 His Lys Lys Ser Ser Asn Glu Asp Lys Ile Leu Asn Arg Asp Pro 635 640 645 Gly Asp Ser Glu Ala Pro Thr Glu Glu Glu Glu Ser Glu Ala Leu 650 655 660 Pro
20 205 PRT Homo sapiens misc_feature Incyte ID No 708425CD1
20 Met Thr Asn Asp Glu Gln Val Glu Tyr Ile Glu Tyr Leu Ser Arg 1 5 10 15 Lys Val Ser Thr Glu Met Gly Leu Arg Glu Gln Leu Asp Ile Ile 20 25 30 Lys Ile Ile Asp Pro Ser Ala Gln Ile Ser Pro Thr Asp Ser Glu 35 40 45 Phe Ile Ile Glu Leu Asn Cys Leu Thr Asp Glu Lys Leu Lys Gln 50 55 60 Val Arg Asn Tyr Ile Lys Glu His Ser Pro Arg Gln Arg Pro Ala 65 70 75 Arg Glu Ala Trp Lys Arg Ser Asn Phe Ser Cys Ala Ser Thr Ser 80 85 90 Gly Val Ser Gly Ala Ser Ala Ser Ala Ser Ser Ser Ser Ala Ser 95 100 105 Met Val Ser Ser Ala Ser Ser Ser Gly Ser Ser Val Gly Asn Ser 110 115 120 Ala Ser Asn Ser Ser Ala Asn Met Ser Arg Ala His Ser Asp Ser 125 130 135 Asn Leu Ser Ala Ser Ala Ala Glu Arg Ile Arg Asp Ser Lys Lys 140 145 150 Arg Ser
Lys Gln Arg Lys Leu Gln Gln Lys Ala Phe Arg Lys Arg 155 160 165 Gln Leu Lys Glu Gln Arg Gln Ala Arg Lys Glu Arg Leu Ser Gly 170 175 180 Leu Phe Leu Asn Glu Glu Val Leu Ser Leu Lys Val Thr Glu Glu 185 190 195 Asp His Glu Ala Asp Val Asp Val Leu Met 200 205
21 833 PRT Homo sapiens misc_feature Incyte ID No 7469966CD1
21 Met His Val Leu Trp Asp Leu Lys Lys Asn Phe Arg Cys Ala Val 1 5 10 15 Leu Lys Asn Lys Thr Thr Asp Phe Ala Glu Ile Ala Glu Gln Val 20 25 30 Ile Asn Leu Val Thr Tyr Arg Ala Lys Ser His Gln Asp Tyr Ile 35 40 45 Pro Val Leu Leu Leu Val Asp Asp Phe Glu Glu Gln Glu Asn Val 50 55 60 Tyr Phe Leu Gln Asn Ala Ile His Ser Val Leu Ala Glu Lys Asp 65 70 75 Leu Arg Tyr Glu Lys Thr Leu Val Ile Ile Leu Asn Cys Met Arg 80 85 90 Ser Arg Asn Pro Asp Glu Ser Ala Lys Leu Ala Asp Ser Ile Ala 95 100 105 Leu Asn Tyr Gln Leu Ser Ser Lys Glu Gln Arg Ala Phe Gly Ala 110 115 120 Lys Leu Lys Glu Ile Glu Lys Gln His Lys Asn Cys Glu Asn Phe 125 130 135 Tyr Ser Phe Met Ile Met Lys Ser Asn Phe Asp Glu Thr Tyr Ile 140 145 150 Glu Asn Val Val Arg Asn Ile Leu Lys Gly Gln Asp Val Asp Ser 155 160 165 Lys Glu Ala Gln Leu Ile Ser Phe Leu Ala Leu Leu Ser Ser Tyr 170 175 180 Val Thr Asp Ser Thr Ile Ser Val Ser Gln Cys Glu Ile Phe Leu 185 190 195 Gly Ile Ile Tyr Thr Ser Thr Pro Trp Glu Pro Glu Ser Leu Glu 200 205 210 Asp Lys Met Gly Thr Tyr Ser Thr Leu Leu Ile Lys Thr Glu Val 215 220 225 Ala Glu Tyr Gly Arg Tyr Thr Gly Val Arg Ile Ile His Pro Leu 230 235 240 Ile Ala Leu Tyr Cys Leu Lys Glu Leu Glu Arg Ser Tyr His Leu 245 250 255 Asp Lys Cys Gln Ile Ala Leu Asn Ile Leu Glu Glu Asn Leu Phe 260 265 270 Tyr Asp Ser Gly Ile Gly Arg Asp Lys Phe Gln His Asp Val Gln 275 280 285 Thr Leu Leu Leu Thr Arg Gln Arg Lys Val Tyr Gly Asp Glu Thr 290 295 300 Asp Thr Leu Phe Ser Pro Leu Met Glu Ala Leu Gln Asn Lys Asp 305 310 315 Ile Glu Lys Val Leu Ser Ala Gly Ser Arg Arg Phe Pro Gln Asn 320 325 330 Ala Phe Ile Cys Gln Ala Leu Ala Arg His Phe Tyr Ile Lys Glu 335 340 345 Lys Asp Phe Asn Thr Ala Leu Asp Trp Ala Arg Gln Ala Lys Met 350
355 360 Lys Ala Pro Lys Asn Ser Tyr Ile Ser Asp Thr Leu Gly Gln Val 365 370 375 Tyr Lys Ser Glu Ile Lys Trp Trp Leu Asp Gly Asn Lys Asn Cys 380 385 390 Arg Ser Ile Thr Val Asn Asp Leu Thr His Leu Leu Glu Ala Ala 395 400 405 Glu Lys Ala Ser Arg Ala Phe Lys Glu Ser Gln Arg Gln Thr Asp 410 415 420 Ser Lys Asn Tyr Glu Thr Glu Asn Trp Ser Pro Gln Lys Ser Gln 425 430 435 Arg Arg Tyr Asp Met Tyr Asn Thr Ala Cys Phe Leu Gly Glu Ile 440 445 450 Glu Val Gly Leu Tyr Thr Ile Gln Ile Leu Gln Leu Thr Pro Phe 455 460 465 Phe His Lys Glu Asn Glu Leu Ser Lys Lys His Met Val Gln Phe 470 475 480 Leu Ser Gly Lys Trp Thr Ile Pro Pro Asp Pro Arg Asn Glu Cys 485 490 495 Tyr Leu Ala Leu Ser Lys Phe Thr Ser His Leu Lys Asn Leu Gln 500 505 510 Ser Asp Leu Lys Arg Cys Phe Asp Phe Phe Ile Asp Tyr Met Val 515 520 525 Leu Leu Lys Met Arg Tyr Thr Gln Lys Glu Ile Ala Glu Ile Met 530 535 540 Leu Ser Lys Lys Val Ser Arg Cys Phe Arg Lys Tyr Thr Glu Leu 545 550 555 Phe Cys His Leu Asp Pro Cys Leu Leu Gln Ser Lys Glu Ser Gln 560 565 570 Leu Leu Gln Glu Glu Asn Cys Arg Lys Lys Leu Glu Ala Leu Arg 575 580 585 Ala Asp Arg Phe Ala Gly Leu Leu Glu Tyr Leu Asn Pro Asn Tyr 590 595 600 Lys Asp Ala Thr Thr Met Glu Ser Ile Val Asn Glu Tyr Ala Phe 605 610 615 Leu Leu Gln Gln Asn Ser Lys Lys Pro Met Thr Asn Glu Lys Gln 620 625 630 Asn Ser Ile Leu Ala Asn Ile Ile Leu Ser Cys Leu Lys Pro Asn 635 640 645 Ser Lys Leu Ile Gln Pro Leu Thr Thr Leu Lys Lys Gln Leu Arg 650 655 660 Glu Val Leu Gln Phe Val Gly Leu Ser His Gln Tyr Pro Gly Pro 665 670 675 Tyr Phe Leu Ala Cys Leu Leu Phe Trp Pro Glu Asn Gln Glu Leu 680 685 690 Asp Gln Asp Ser Lys Leu Ile Glu Lys Tyr Val Ser Ser Leu Asn 695 700 705 Arg Ser Phe Arg Gly Gln Tyr Lys Arg Met Cys Arg Ser Lys Gln 710 715 720 Ala Ser Thr Leu Phe Tyr Leu Gly Lys Arg Lys Gly Leu Asn Ser 725 730 735 Ile Val His Lys Ala Lys Ile Glu Gln Tyr Phe Asp Lys Ala Gln 740 745 750 Asn Thr Asn Ser Leu Trp His Ser Gly Asp Val Trp Lys Lys Asn 755 760 765 Glu Val Lys Asp Leu Leu Arg Arg Leu Thr Gly Gln Ala Glu Gly 770 775 780 Lys
Leu Ile Ser Val Glu Tyr Gly Thr Glu Glu Lys Ile Lys Ile 785 790 795 Pro Val Ile Ser Val Tyr Ser Gly Pro Leu Arg Ser Gly Arg Asn 800 805 810 Ile Glu Arg Val Ser Phe Tyr Leu Gly Phe Ser Ile Glu Gly Pro 815 820 825 Leu Ala Tyr Asp Ile Glu Val Ile 830
22 655 PRT Homo sapiens misc_feature Incyte ID No 6920129CD1
22 Met Ser Gln Arg Asp Gly Val Cys Gly Ser His Glu Val Ala Gly 1 5 10 15 Ala Ala Ser Pro Gly Ala Asp Gly Gly Leu Ser Leu Ala Ala Tyr 20 25 30 Cys Lys Asn Ser Val Asp Gly Leu Trp Tyr Cys Phe Asp Asp Ser 35 40 45 Asp Val Gln Gln Leu Ser Glu Asp Glu Val Cys Thr Gln Thr Ala 50 55 60 Tyr Ile Leu Phe Tyr Gln Arg Arg Thr Ala Ile Pro Ser Trp Ser 65 70 75 Ala Asn Ser Ser Val Ala Gly Ser Thr Ser Ser Ser Leu Cys Glu 80 85 90 His Trp Val Ser Arg Leu Pro Gly Ser Lys Pro Ala Ser Val Thr 95 100 105 Ser Ala Ala Ser Ser Arg Arg Thr Ser Leu Ala Ser Leu Ser Glu 110 115 120 Ser Val Glu Met Thr Gly Glu Arg Ser Glu Asp Asp Gly Gly Phe 125 130 135 Ser Thr Arg Pro Phe Val Arg Ser Val Gln Arg Gln Ser Leu Ser 140 145 150 Ser Arg Ser Ser Val Thr Ser Pro Leu Ala Val Asn Glu Asn Cys 155 160 165 Met Arg Pro Ser Trp Ser Leu Ser Ala Lys Leu Gln Met Arg Ser 170 175 180 Asn Ser Pro Ser Arg Phe Ser Gly Asp Ser Pro Ile His Ser Ser 185 190 195 Ala Ser Thr Leu Glu Lys Ile Gly Glu Ala Ala Asp Asp Lys Val 200 205 210 Ser Ile Ser Cys Phe Gly Ser Leu Arg Asn Leu Ser Ser Ser Tyr 215 220 225 Gln Glu Pro Ser Asp Ser His Ser Leu Arg Glu His Lys Ala Val 230 235 240 Gly Arg Ala Pro Leu Ala Val Met Glu Gly Val Phe Lys Asp Glu 245 250 255 Ser Asp Thr Arg Arg Leu Asn Ser Ser Val Val Asp Thr Gln Ser 260 265 270 Lys His Ser Ala Gln Gly Asp Arg Leu Pro Pro Leu Ser Gly Pro 275 280 285 Phe Asp Asn Asn Asn Gln Ile Ala Tyr Val Asp Gln Ser Asp Ser 290 295 300 Val Asp Ser Ser Pro Val Lys Glu Val Lys Ala Pro Ser His Pro 305 310 315 Gly Ser Leu Ala Lys Lys Pro Glu Ser Thr Thr Lys Arg Ser Pro 320 325 330 Ser Ser Lys Gly Thr Ser Glu Pro Glu Lys Ser Leu Arg Lys Gly 335 340 345 Arg Pro Ala Leu Ala Ser Gln Glu Ser Ser Leu Ser Ser Thr Ser 350 355 360 Pro
Ser Ser Pro Leu Pro Val Lys Val Ser Leu Lys Pro Ser Arg 365 370 375 Ser Arg Ser Lys Ala Asp Ser Ser Ser Arg Gly Ser Gly Arg His 380 385 390 Ser Ser Pro Ala Pro Ala Gln Pro Lys Lys Glu Ser Ser Pro Lys 395 400 405 Ser Gln Asp Ser Val Ser Ser Pro Ser Pro Gln Lys Gln Lys Ser 410 415 420 Ala Ser Ala Leu Thr Tyr Thr Ala Ser Ser Thr Ser Ala Lys Lys 425 430 435 Ala Ser Gly Pro Ala Thr Arg Ser Pro Phe Pro Pro Gly Lys Ser 440 445 450 Arg Thr Ser Asp His Ser Leu Ser Arg Glu Gly Ser Arg Gln Ser 455 460 465 Leu Gly Ser Asp Arg Ala Ser Ala Thr Ser Thr Ser Lys Pro Asn 470 475 480 Ser Pro Arg Val Ser Gln Ala Arg Ala Gly Glu Gly Arg Gly Ala 485 490 495 Gly Lys His Val Arg Ser Ser Ser Met Ala Ser Leu Arg Ser Pro 500 505 510 Ser Thr Ser Ile Lys Ser Gly Leu Lys Arg Asp Ser Lys Ser Glu 515 520 525 Asp Lys Gly Leu Ser Phe Phe Lys Ser Ala Leu Arg Gln Lys Glu 530 535 540 Thr Arg Arg Ser Thr Asp Leu Gly Lys Thr Ala Leu Leu Ser Lys 545 550 555 Lys Ala Gly Gly Ser Ser Val Lys Ser Val Cys Lys Asn Thr Gly 560 565 570 Asp Asp Glu Ala Glu Arg Gly His Gln Pro Pro Ala Ser Gln Gln 575 580 585 Pro Asn Ala Asn Thr Thr Gly Lys Glu Gln Leu Val Thr Lys Asp 590 595 600 Pro Ala Ser Ala Lys His Ser Leu Leu Ser Ala Arg Lys Ser Lys 605 610 615 Ser Ser Gln Leu Asp Ser Gly Val Pro Ser Ser Pro Gly Gly Arg 620 625 630 Gln Ser Ala Glu Lys Ser Ser Lys Lys Leu Ser Ser Ser Met Gln 635 640 645 Thr Ser Ala Arg Pro Ser Gln Lys Pro Gln 650 655
23 330 PRT Homo sapiens misc_feature Incyte ID No 71514704CD1
23 Met Gly Ser Arg Lys Lys Glu Ile Ala Leu Gln Val Asn Ile Ser 1 5 10 15 Thr Gln Glu Leu Trp Glu Glu Met Leu Ser Ser Lys Gly Leu Thr 20 25 30 Val Val Asp Val Tyr Gln Gly Trp Cys Gly Pro Cys Lys Pro Val 35 40 45 Val Ser Leu Phe Gln Lys Met Arg Ile Glu Val Gly Leu Asp Leu 50 55 60 Leu His Phe Ala Leu Ala Glu Ala Asp Arg Leu Asp Val Leu Glu 65 70 75 Lys Tyr Arg Gly Lys Cys Glu Pro Thr Phe Leu Phe Tyr Ala Gly 80 85 90 Gly Glu Leu Val Ala Val Val Arg Gly Ala Asn Ala Pro Leu Leu 95 100 105 Gln Lys Thr Ile Leu Asp Gln Leu Glu Ala Glu Lys Lys Val Leu 110
115 120 Ala Glu Gly Arg Glu Arg Lys Val Ile Lys Asp Glu Ala Leu Ser 125 130 135 Asp Glu Asp Glu Cys Val Ser His Gly Lys Asn Asn Gly Glu Asp 140 145 150 Glu Asp Met Val Ser Ser Glu Arg Thr Cys Thr Leu Ala Ile Ile 155 160 165 Lys Pro Asp Ala Val Ala His Gly Lys Thr Asp Glu Ile Ile Met 170 175 180 Lys Ile Gln Glu Ala Gly Phe Glu Ile Leu Thr Asn Glu Glu Arg 185 190 195 Thr Met Thr Glu Ala Glu Val Arg Leu Phe Tyr Gln His Lys Ala 200 205 210 Gly Glu Glu Ala Phe Glu Lys Leu Val His His Met Cys Ser Gly 215 220 225 Pro Ser His Leu Leu Ile Leu Thr Arg Thr Glu Gly Phe Glu Asp 230 235 240 Val Val Thr Thr Trp Arg Thr Val Met Gly Pro Arg Asp Pro Asn 245 250 255 Val Ala Arg Arg Glu Gln Pro Glu Ser Leu Arg Ala Gln Tyr Gly 260 265 270 Thr Glu Met Pro Phe Asn Ala Val His Gly Ser Arg Asp Arg Glu 275 280 285 Asp Ala Asp Arg Glu Leu Ala Leu Leu Phe Pro Ser Leu Lys Phe 290 295 300 Ser Asp Lys Asp Thr Glu Ala Pro Gln Gly Gly Glu Ala Glu Ala 305 310 315 Thr Ala Gly Pro Thr Glu Ala Leu Cys Phe Pro Glu Asp Val Asp 320 325 330
24 276 PRT Homo sapiens misc_feature Incyte ID No 7715945CD1
24 Met Cys Ile Ile Phe Phe Lys Phe Asp Pro Arg Pro Val Ser Lys 1 5 10 15 Asn Ala Tyr Arg Leu Ile Leu Ala Ala Asn Arg Asp Glu Phe Tyr 20 25 30 Ser Arg Pro Ser Lys Leu Ala Asp Phe Trp Gly Asn Asn Asn Glu 35 40 45 Ile Leu Ser Gly Leu Asp Met Glu Glu Gly Lys Glu Gly Gly Thr 50 55 60 Trp Leu Gly Ile Ser Thr Arg Gly Lys Leu Ala Ala Leu Thr Asn 65 70 75 Tyr Leu Gln Pro Gln Leu Asp Trp Gln Ala Arg Gly Arg Gly Glu 80 85 90 Leu Val Thr His Phe Leu Thr Thr Asp Val Asp Ser Leu Ser Tyr 95 100 105 Leu Lys Lys Val Ser Met Glu Gly His Leu Tyr Asn Gly Phe Asn 110 115 120 Leu Ile Ala Ala Asp Leu Ser Thr Ala Lys Gly Asp Val Ile Cys 125 130 135 Tyr Tyr Gly Asn Arg Gly Glu Pro Asp Pro Ile Val Leu Thr Pro 140 145 150 Gly Thr Tyr Gly Leu Ser Asn Ala Leu Leu Glu Thr Pro Trp Arg 155 160 165 Lys Leu Cys Phe Gly Lys Gln Leu Phe Leu Glu Ala Val Glu Arg 170 175 180 Ser Gln Ala Leu Pro Lys Asp Val Leu Ile Ala Ser Leu Leu Asp 185 190 195 Val Leu Asn Asn Glu Glu Ala
Gln Leu Pro Asp Pro Ala Ile Glu 200 205 210 Asp Gln Gly Gly Glu Tyr Val Gln Pro Met Leu Ser Lys Tyr Ala 215 220 225 Ala Val Cys Val Arg Cys Pro Gly Tyr Gly Thr Arg Thr Asn Thr 230 235 240 Ile Ile Leu Val Asp Ala Asp Gly His Val Thr Phe Thr Glu Arg 245 250 255 Ser Met Met Asp Lys Asp Leu Ser His Trp Glu Thr Arg Thr Tyr 260 265 270 Glu Phe Thr Leu Gln Ser 275
25 232 PRT Homo sapiens misc_feature Incyte ID No 7025368CD1
25 Met Ala Gly Asp Thr Glu Val Trp Lys Gln Met Phe Gln Glu Leu 1 5 10 15 Met Arg Glu Val Lys Pro Trp His Arg Trp Thr Leu Arg Pro Asp 20 25 30 Lys Gly Leu Leu Pro Asn Val Leu Lys Pro Gly Trp Met Gln Tyr 35 40 45 Gln Gln Trp Thr Phe Ala Arg Phe Gln Cys Ser Ser Cys Ser Arg 50 55 60 Asn Trp Ala Ser Ala Gln Val Leu Val Leu Phe His Met Asn Trp 65 70 75 Ser Glu Glu Lys Ser Arg Gly Gln Val Lys Met Arg Val Phe Thr 80 85 90 Gln Arg Cys Lys Lys Cys Pro Gln Pro Leu Phe Glu Asp Pro Glu 95 100 105 Phe Thr Gln Glu Asn Ile Ser Arg Ile Leu Lys Asn Leu Val Phe 110 115 120 Arg Ile Leu Lys Lys Cys Tyr Arg Gly Arg Phe Gln Leu Ile Glu 125 130 135 Glu Val Pro Met Ile Lys Asp Ile Ser Leu Glu Gly Pro His Asn 140 145 150 Ser Asp Asn Cys Glu Ala Cys Leu Gln Gly Phe Cys Ala Gly Pro 155 160 165 Ile Gln Val Thr Ser Leu Pro Pro Ser Gln Thr Pro Arg Val His 170 175 180 Ser Ile Tyr Lys Val Glu Glu Val Val Lys Pro Trp Ala Ser Gly 185 190 195 Glu Asn Val Tyr Ser Tyr Ala Cys Gln Asn His Ile Cys Arg Asn 200 205 210 Leu Ser Ile Phe Cys Cys Cys Val Ile Leu Ile Val Ile Val Val 215 220 225 Ile Val Val Lys Thr Ala Ile 230
26 120 PRT Homo sapiens misc_feature Incyte ID No 7500954CD1
26 Met Lys Asn Asp Thr Asp Thr Gly Asn Met Gln Lys Glu Val Met 1 5 10 15 Ser Val Thr Glu Gln Val Glu Lys Lys Lys Asn Asp Ile Glu Lys 20 25 30 Asp Asp Thr Gly Arg Lys Arg Lys Pro Asp Ile Ser Leu Leu Glu 35 40 45 Val Ile Val Asp Val Ala Met Lys Val Lys Lys Glu Ile Val Thr 50 55 60 Gly Asp Thr Asn Thr Lys Asn Leu Lys Glu Ala Lys Lys Glu Lys 65 70 75 Lys Arg Ala Val Ser Leu Pro Leu Asn Arg Arg Ala Pro Lys Leu 80 85 90 His Leu Gln Asn Arg His Gly
Phe Gly Leu LeuCys Ile Leu Val 95 100 105 Pro Glu Val Asp Thr Ile Asn Leu Val Ile PheLeu Asp Asn Val 110 115 120 27 675 DNA Homo sapiens misc_feature IncyteID No 6242606CB1 27 attctgggca ttttcagagt tttctcttcc atgtgaattgctccagctaa cttaatatta 60 gtttctcaag ggtttgtgta gacctggctt tgtctctttccatctgtttc tctctttccc 120 tttaacgtca cttgcctgag gtctataagg ccctttaaggttgccagccc agagccagaa 180 gagattggtg tccctgaagt taagaaagcc acctttgcctacagatggtg ttcacagtgt 240 tggtgtggag tcacagcttg acttttgggg gccccaggagatgttgaccc agcagggcat 300 ggcgctgcag aactacgaca acaagctggt caaatgcatagaggagctat gccagaagca 360 ggaggagctg tgctggcaga tccagcagga ggaggacaagaaacagcggc tgcagaatga 420 ggtgaggcag ctgacagaga agctggcctg cgtcaacgagaagctggccc gcgtcaacga 480 gaacctggca cgcaagattg cctcttgcag taagttctaccagaccatcg cggagacgga 540 ggccacctac ctcaagatgc tggagagctc ccagactttgctcagtgtcc tgaagaggga 600 agctgggaac ctgaccaagg ctacagcctc agaccagaaaagtagtggtg gcagggacag 660 ctgaccagac catgg 675 28 1387 DNA Homo sapiensmisc_feature Incyte ID No 997105CB1 28 gcaacattct attttgcgta ggtgtcgggagagctttctt ttgcgatttg ttccttgatc 60 cggaagaaaa gatagcaaat atcacttgccattggattgc atcataaggg ttgggcttta 120 gcgagccagt caccttggca accggacgtgacgaccccgg aagacgcccg accgagcttt 180 gctccagcta tcggcttcag tgggcggcgcggcagagcca gactgacgca tcttcagcac 240 tgaccaggct ctgttggggg tggtgccaccaaggccacgc cccctgcgcc tgctccggcg 300 gagtctggag agttgtcgta ggctctgatggcggagactc acgccgattc tctctcagcg 360 tagggctgtg agggcgcagt ccaccgccaggagccttccg gtttctgcgc ggtgcgcgac 420 ctcgtcccga agcctgggga tacaccctctcgagagcccg ctgtcgccct ccgttaaggt 480 cgaacccctc acagttgctg tgggcaactccagcccaaca ttccctcgct ctggttctcg 540 ccccattggg aaactcggcc ccacgcttcccacttttctg gatgaggtgt cccctttctc 600 cccactaaaa tgtcaaataa cctacggagggtcttcctga aacccgcaga ggaaaattca 660 ggcaacgcct cgcgttgtgt ttcaggctgcatgtaccaag tagttcagac gattggctcg 720 gatggaaaaa atcttctgca attacttccaattcctaagt cttctggaaa tcttatacca 780 ctagttcaat cttcagtcat gtctgatgctttgaaaggga atacaggaaa accagttcaa 840 gttacttttc 
agactcagat ttccagctcttccacaagtg catcagttca attgcccatt 900 tttcagccag ccagttcttc aaactattttcttacaagaa cagtagatac atcagaaaaa 960 ggtagagtta cttctgtggg aactggaaatttttcttcat cagtttctaa agttcagagt 1020 catggtgtga aaattgatgg actcaccatgcaaacatttg ctgttcctcc ctcaacacaa 1080 aaagactcat cttttattgt agttaatacccagagtcttc cagtgactgt gaagtctcca 1140 gttttgcctt ctgggcatca tttacagattccagcccatg ctgaagtgaa atctgtacca 1200 gcgtcatcat tgcctccttc agtgcagcaaaagatacttg caactgccac caccagtacc 1260 tcaggaatgg ttgaggcctc ccaaatgccaaccgttattt atgtatctcc tgtaatcctt 1320 acactttgga agtctgaggc aggaggatcacttgaggcca ggagttcggc tgggcaatgg 1380 caataga 1387 29 1534 DNA Homosapiens misc_feature Incyte ID No 1482843CB1 29 ggttttgagg attgggacttctagggtatt tttttttttg ccggcccctc caggtttatt 60 agtgttgcat cagcaccttgaaatagtcca cacggtcaca tacgtgacat tagaatccgt 120 tccccaaaaa gctatttacattaatccagg gtttcattaa aaaaaactag caacacatac 180 atcctagaaa ccaaacaatgcacaatgaac cgcatggggg tctgcagaaa ctacagtggt 240 gcttctttgg cccacagggtcagattccag aggtgcccag caagcccagc tctcagtccc 300 ctacaggcca cacacccgcggacggcagga caggtctcgg ggagggggcg gcctggccct 360 cggaagaagt ggcctggtcattttcatgag gaattcattc ccggaggaag gcgagaggca 420 ctcaggtctt tccatctgggaatgaggaca gtgtggcaag gctgccagta ccctgcagag 480 acacccccag ctctcacccagaggcccctc tcctgcccca gctctggccg aggcactgaa 540 ccaggacact ggcttgcatgcccagagaca gtggccaagg ggaccaacat ggaagaggaa 600 ggcagcagac cagattgcagggcgggactg gcccacagct ggcctgggtg agggtggggt 660 ctgtctacgt aatagtggacagtggccacc aggcctgagg ggtgcctcgg tggtgactgg 720 tggcctaacc tttgagactgatggggctga gccctagctc tacaaggggc tgcctcctgt 780 cctgtgaggc cagtccagggccaggttctc ggggggtgtg cacacctcac acacacatcc 840 cccgccctta ccgcactggctgtcagatgg ctttggtttc tagttgtcct tggtgggaca 900 gagtgctcct ccatgcaccctcagaacagc tctgtgaggt aggcagggaa aatgcgacca 960 tcctgtgtaa ccagtgggaaaaccgaggca cagagggctg gcatggccca ggtcttgtct 1020 gcccaacctc cctggtgctgggcatggtgt gggggatgga ccgtggggct cctgggtgca 1080 gggcagaccc acagcccagcagcagcaggg gcacagggag tgccaccaga 
acccccagct 1140 gctggatgga gactctgcccatgttctccc cttgcgtggc taaggatgga gctgtgcagc 1200 cagatgggag gtacggggtaggcaggaggg ctccaggaaa ggggctgccc acagccagga 1260 tgcactaggc agcctgtggccccttcctct aaaacctgct ctgaggccag aaagccacag 1320 caaggatgtt tccaccttcatctccctact gagtgtgacc cttgggaaag aaacaggaag 1380 agcaggcagg acgcggtgactcacgcctgt aatcccagca ctttgggagg ccgaggcggg 1440 tggatcactt gaggtcaggagttcaagacc agcctggcca acatggcaaa accccatctc 1500 tactaaaaat acaaaaaaaatagccaggtg tggt 1534 30 4203 DNA Homo sapiens misc_feature Incyte ID No3218219CB1 30 ccgggctccg tcccgtgtaa ccgccagtct cagaagccta taacaggctaccatttttat 60 gaagctcaaa aacgaaaatg aatttagtga tacagtgttt agacacatacttttgtaaga 120 aaacttaaaa acttggggga accataaaca taaaattgaa gacagtggttacctttgggg 180 tgagggaagc tatgcggtat agacccggct agaatctgaa gtgcgggagagtgctggacc 240 ctgagtgatg gggccggcgc cagctggaga gcagcttcgc ggagcgactggagagccaga 300 ggtgatggaa ccagccctgg aaggcacagg caaagagggg aagaaagcatcctccaggaa 360 gcgtacattg gctgaacctc cagcgaaggg cctcctgcag ccagtgaagctcagcagggc 420 agaactgtac aaggagccta ccaatgagga gcttaatcgc cttcgggagactgagatctt 480 gttccactcc agcttgcttc gtttacaggt agaggagcta ctaaaggaagtaaggctgtc 540 agagaagaag aaggatcgga ttgatgcctt cctacgggag gtcaaccagcgggttgtgag 600 ggtgccctca gtccctgaga cagagctcac tgaccaggca tggctcccagctggggttcg 660 agtgcccctc caccaagtgc cctatgccgt gaagggctgt ttccgcttcctgcccccagc 720 ccaggttact gttgtgggca gctaccttct gggcacctgc atccgaccagacatcaatgt 780 ggatgtggca ctgaccatgc ccagggaaat cctacaggac aaggacgggctgaaccagcg 840 ctacttccgc aagcgtgccc tctacctggc ccacttggct caccacctggcccaggaccc 900 cctctttggc agtgtttgct tctcctacac aaatggctgc cacctgaaaccctcactgtt 960 gctgcggccg cgtggaaagg atgagcgcct ggtcactgta cgtctgcatccgtgccctcc 1020 acctgacttc ttccgcccgt gccgcttgct gccaaccaag aacaatgtgcgctctgcctg 1080 gtaccgaggg cagagtcctg caggggatgg tagcccagag cctcctaccccccgctataa 1140 cacatgggtc ctgcaagata cagttctcga gtcccatttg cagctgctgtcaaccattct 1200 gagttcagcc cagggcctga aggatggcgt ggcacttctg aaggtctggctgcggcagcg 1260 
ggagctggac aagggccagg gtgggtttac tgggttcctt gtctccatgctggttgtctt 1320 ccttgtgtct acacgcaaga tccataccac catgagtggc taccaggtcctgagaagtgt 1380 cttgcagttt ctggccacta cagacctgac agtcaacggg atcagtttatgtctcagctc 1440 agatccctct ttgccggccc tggctgactt ccaccaggcc ttctccgttgtcttcctgga 1500 ttcctcaggc catctcaacc tctgtgctga tgtcactgcc tctacttaccaccaggtaca 1560 gcatgaggca cggctgtcta tgatgttgct ggacagcaga gctgacgacgggttccacct 1620 gctgttgatg actcccaaac ccatgatccg ggcttttgac catgtcctgcatctccgtcc 1680 actgagtcgc ctgcaggcag cgtgccaccg gctgaagctc tggccagagctgcaggacaa 1740 tggtggggac tatgtctcag ctgctttggg ccccctgacc accctcctggagcagggcct 1800 gggggctcgg ctgaacctgc tggctcactc tcgaccccca gtcccagagtgggacatcag 1860 ccaagatcca ccaaagcaca aagactctgg gaccctgacc ctgggactccttctccggcc 1920 tgagggactg accagcgtcc ttgagctggg tccagaggca gaccagcctgaggctgctaa 1980 attccgccag ttctggggat cccgctcgga gcttcggcgt ttccaggacggagccattcg 2040 ggaagctgtg gtctgggagg cagcctctat gtcccagaag cgccttattccccaccaggt 2100 ggtcacccat ctcttggcac tccatgctga catcccagaa acctgtgtccactatgtggg 2160 gggccccctg gatgcactta tccaaggcct gaaagagacc tccagcacaggtgaggaggc 2220 cctggtagcg gcggtacgtt gctacgacga cctcagtcgc ctactgtgggggctagaggg 2280 tctcccactg accgtgtctg ctgttcaggg agctcaccca gtgctgcgctacacagaggt 2340 gttcccacca actccagtcc gtccagcctt ctccttctat gagactctgcgggagcggtc 2400 ctcactgctg ccccggctcg ataagccctg tccggcctac gtggagcccatgaccgtggt 2460 ttgtcacctg gagggcagtg gccagtggcc acaggacgct gaggccgtgcagcgggtccg 2520 agctgccttc cagctgcgcc tggcagagct gttgacacaa cagcatggtctgcagtgccg 2580 tgccactgcc acgcacacgg atgtccttaa ggatggattt gtgtttcggattcgcgtggc 2640 ctatcagcgg gagccccaga tcctgaagga ggtgcagagc ccagaggggatgatctcgct 2700 gagggacaca gctgcctccc tccgccttga gagagacaca aggcagttgccactgctcac 2760 cagtgccctg cacggactgc agcagcagca cccagccttc tctggtgtggcacggctggc 2820 caagcggtgg gtgcgtgccc agcttcttgg tgagggtttc gctgatgagagcctggatct 2880 ggtggccgct gcccttttcc tgcaccctga gcccttcacc cctccgagttccccccaggt 2940 tggcttcctt cgattccttt tcttggtatc 
aacgtttgat tggaagaacaaccccctctt 3000 tgtcaacctc aataatgagc tcactgtgga ggagcaggtg gagatccgcagtggcttcct 3060 ggcagctcgg gcacagctcc ccgtcatggt cattgttacc ccccaagaccgcaaaaactc 3120 tgtgtggaca caggatggac cctcagccca gatcctgcag cagcttgtggtcctggcagc 3180 tgaagccctg cccatgttag agaagcagct catggatccc cggggacctggggacatcag 3240 gacagtgttc cggccgccct tggacattta cgacgtgctg attcgcctgtctcctcgcca 3300 tatcccgcgg caccgccagg ctgtggactc gccagctgcc tccttctgccggggcctgct 3360 cagccagccg gggccctcat ccctgatgcc cgtgctgggc tatgatcctcctcagctcta 3420 tctgacgcag ctcagggagg cctttgggga tctggccctt ttcttctatgaccagcatgg 3480 tggagaggtg attggtgtcc tctggaagcc caccagcttc cagccgcagcccttcaaggc 3540 ctccagcaca aaggggcgca tggtgatgtc tcgaggtggg gagctagtaatggtgcccaa 3600 tgttgaagca atcctggagg actttgctgt gctgggtgaa ggcctggtgcagactgtgga 3660 ggcccgaagt gagaggtgga ctgtgtgatc ccagctctgg agcaagctgtagacggacag 3720 caggacattg gacctctaga gcaagatgtc agtaggatga cctccaccctccttggacat 3780 gaatcctcca tggagggcct gctggctgaa catgctgaat catctccaacaaaacccagc 3840 cccaactttc tctctgatgc tccagcattg gggcaggggc atggtggcccatgtagtctc 3900 ctgggcctca ccatcccaga agaggagtgg gagccagctc agagaaggaactgaacccag 3960 gagatccatc cacctattag ccctgggcct ggacctccct gcgatttcccactcctttct 4020 tagtcttctt ccagaaacag agaaggggat gtgtgcctgg gagaggctctgtctccttcc 4080 tgctgccagg acctgtgcct agacttagca tgcccttcac tgcagtgtcaggcctttaga 4140 tgggacccag cgaaaatgtg gcccttctga gtcacatcac cgacactgagcagtggaaag 4200 ggg 4203 31 2082 DNA Homo sapiens misc_feature Incyte IDNo 7371678CB1 31 cggttgctct ggagagggtt gtggcactcg gggctgggtg gctctccctaagcttgctct 60 gacaaaagag ttttgagttg gtcttttggc tgagcctgcc caggaaggcaggctcctgca 120 ggagtcttgg agggtcggat gcggcgccgg atgaggatga ggcgtggcggaagatgcgtt 180 tggctctgca gacactgcat cgggcagcag gggactctgg gaggctggtgcagccagaag 240 gcatggctct tgacagcctt ctagtagaat ctctggaatt gtgcatgtcccccccgcctc 300 cagccagcct gtgcagactt gctgcctgct gtgtcaccgg gaacgcaaaggctgggaaga 360 aggcccttct caaaatggac tggtgttgca gggtgagaag ctgccccctgacttcatgcc 420 
aaagctcgtc aagaatctcc taggcgagat gcctctgtgg gtctgccagagttgccgaaa 480 gagcatggag gaagatgaaa ggcagacagg tcgagaacat gcagtggcgatctccttgtc 540 acacacatcc tgcaaatcac agtcttgtgg agatgactct cattcgtcctcgtcttcctc 600 ctcatcatcc tcatcctcgt cctcctcttc ctgccctggg aactcgggagactgggatcc 660 tagctcgttc ctgtcggcac ataagctctc gggcctctgg aattccccacattccagtgg 720 ggccatgcca ggcagctctc ttgggagtcc tcctaccatc cccggtgaggctttccccgt 780 ctcggagcac caccagcact cagacctcac tgctccccct aacagccccaccggccacca 840 cccgcagcca gcatctctaa tcccgtctca ccccagctcc tttggctccccaccccaccc 900 acacctgctg cccaccaccc cggcagcacc tttccctgcc caggcttcagagtgccctgt 960 tgctgctgcc actgcccccc acactccagg gccatgtcag agctcccatctaccctccac 1020 cagcatgccg ctcctgaaga tgcccccacc attctcgggg tgcagccacccctgcagcgg 1080 gcactgtggt gggcactgca gtgggcctct cctcccaccc ccgagctctcagccactccc 1140 tagcactcac agggatcccg ggtgcaaggg gcacaagttt gcacacagtggcctggcttg 1200 ccagctgccc cagccctgcg aggcagatga ggggctgggt gaggaagaggatagcagctc 1260 tgagcgaagc tcctgcacct catcctccac ccaccagaga gatgggaagttctgtgactg 1320 ctgctactgt gagttcttcg gccacaatgc ggtgagtgag cctgcccaggctagagaggg 1380 ctgggggcca gccaaggtga gcaaaaggag caccctgctc tccccagagaccccgttgct 1440 cacggtggta attaccgtgc tggggccgtg agtccttgca cctgcgtctgctcgcttgat 1500 cctcacagca catttgtgag gtaggggttt ttaaacattt tgctacctttgctcttctca 1560 tggtttctct taagattgac tttttttcct taggcaaaag gaaaggaaatggcagagaga 1620 aagctatgat tctgatgagt atgtatacgt gtgtaatccc agagaagtgaacgcttggga 1680 gtgatgaagg cagagtggaa gcaaaaaggc tctcagtccc ccaagtgtgacagccagccg 1740 agggacaggc cgtgagcaca gacggcgcca ggaaggaggc tcagatcagagggcatgctg 1800 gctctggcca gggggaggaa gcagtgcaga agtctcataa gccacccgctgccccgacga 1860 gtcggaacta taccgagatc cgggagaagc tccgctcgag gctgaccaggcggaaagagg 1920 agctgcccat gaaggggggc accctgggcg ggatccctgg ggagcccgccgtggaccacc 1980 gagatgtgga tgagctgctg gaattcatca acagcacgga gcccaaagtccccaacagcg 2040 ccagggccgc caagcgggcc cggcacaagc tgaaaaaaaa aa 2082 321459 DNA Homo sapiens misc_feature Incyte ID No 7473883CB1 32 
gctacctagtgggtcttggg gaccttcgaa atcgccgccg ctctcacaat ggcttgggtc 60 cagactgcgccacagcctct cgggagacgt gggccctcgg aaccttttta gtgccggact 120 ccgggccgcaggcagtcccg cggcagcagg atcacagctc ttgaggctag tgcgcctcta 180 ggcaagatgtccctgcccat cgggatatac cgccgggcag tcagctatga tgataccctc 240 gaggaccctgcgcccatgac tcctcctcca tcggacatgg gcagcgtccc ttggaagcca 300 gtgattccagagcgcaagta tcagcacctc gccaaggtgg aggaaggaga ggccagtcta 360 ccctcccctgccatgaccct gtcatcagcc attgacagtg tggacaaggt cccagtggtg 420 aaggctaaagctacccatgt catcatgaat tctctgatca caaaacagac ccaggaaagc 480 attcagcattttgagcgaca ggcagggctg agagatgctg gctacacacc ccacaagggc 540 ctcaccaccgaggagaccaa gtaccttcga gtggccgaag cactccacaa actaaagtta 600 cagagtggagaggtaacaaa agaagagagg cagcctgcat cagcccagtc caccccaagc 660 accactccgcactcttcacc taagcagagg cccaggggct ggttcacttc tggttcttcc 720 acagccttacctggcccaaa tcctagcacc atggactctg gaagtgggga taaggacaga 780 aacttgtcagataagtggag cctctttgga ccgagatccc ttcagaagta cgattctgga 840 agttttgccacccaggccta ccgaggagcc cagaagccct ctccattgga actgatacgt 900 gcccaggccaaccgaatggc tgaagatcca gcagccttga agccccccaa gatggacatc 960 ccagtgatggaaggaaagaa acagccacca cgggcccata acctcaaacc ccgtgacctg 1020 aatgtgctcacacccactgg cttctagagc cctctttcca gggattctgg taaaggtggt 1080 ttcttgcatcccactcccct tttaccttgg ctttgacata ggaaaggtat atttaaaaac 1140 ttaatcagctgggcgtggtg gctcacgcct gtaatcccag cactttggga ggccaaggta 1200 ggtggatacctgaggtcagg agttcaagac cagcctggcc aacatggtga aaccccgtct 1260 ctactaaaaatacaaaaatt agctgggcgt ggtggtgggc gcctgtagtc ccagctactt 1320 gggaggctgaggcaggagaa tcgcctgaac ccaggaagca gatgttgtac cgagctgaga 1380 tcatgccattacactccagc ctgggcgaca gaatcgagac tttccagcac actgcgcncg 1440 tacaagtgagccgagctcg 1459 33 1806 DNA Homo sapiens misc_feature Incyte ID No7478662CB1 33 acttcctcca tcctgtccag caggatgggc tggacagtgg gacagcctgtgtgcacattt 60 tgtggcaagt aggagtgaca caccatccct gggaggcacc atggttcctgccaaacccaa 120 ccccagaact ctgtccctga ggtggtttta ccaaaaccca aaacccagaattgtgcttgt 180 ggctcagggg tcagcacctg ctagtaccag 
gacactactg ggaggctgggacctgaccga 240 agcccatggt gtctgtggcc tgaggacagg gtgtgttggg gccataagtcccggccacca 300 atggccattg ggtcctaggg cctcagcccc agtgtttgcc cttccctggctccttctggt 360 tcagtcccat tagggccctg gagcccaaga cccagcaccc aaggtcccctccaggaatac 420 tggcggcttg gcttccttta tcatgtttca tctgagagca aaaatgtcagatcggatgca 480 cagaaaaatg gcccaaattg tttaataact agaagaaata taggagcagcaagagggtaa 540 tatggagagg ggagggcctc catgaccggt gtctgcagag ccaggggtacaggcacccag 600 tgttgtggcc tggcaccacc ggcctctcag agcgtgggtg gcccacgtgtctttacccgg 660 aggacagcag gcctgatcac cagcttttct acctgtccct gtaagcatcacgttgctaga 720 agaaaatctc atgccagagc ttgcaccatc cctagcttgg gggttaggggttgtctcttg 780 gtgacctaaa tgaaaaaata ggtccagatc agagttcctg acgcagagcactcacccact 840 ctttgaatcg tgggagggga ggcctggttt tagttaaacc taacctctttgaggaaccac 900 agagcccaag actggaagcc ttcagaatct tctggccccc aaccctccctggggacccct 960 gtggcctgtc tcaccagagc actcttctgt ctgtagatgt ctcagctgctctacaaggga 1020 gtcccatttc aggtgtgggg ctgggcatgg tcactcctgc tggatgtctagaaggtggaa 1080 actaaggacc taggaaaatg ccagatacag cctttccacc ctcatccagagcaggacaaa 1140 caggcccggt ggtgtcagga gcccaggtct ccagctggag ggaacgtcaaccctgcagtg 1200 ggagcagggg cccatcgcac atcctaggca cagatgctaa tgcaggcactgcaggtaagc 1260 tgggcttggt atccttccct ggcttcagaa agaagccaac aaggagcgttttgcagaatg 1320 aaacctttgt ttccagaagc actgctgact gtaagtggtt gccgtttgtggcagtgagca 1380 ttttgtccat tctgaggttg gattggtttc tccttttggc cttgccctgccctacagacc 1440 ataaaggaga acagcaagaa gcccccagca aacatccaca gatggccctggacatcagcc 1500 acattctgag gaacatgtca tgttctggga gggctaaggc atcaagtaaggcctgtgggg 1560 ctggaggatc ccaggcaagg tggggcaatc cagagccatg ggggcttcccatgggaattg 1620 ggaggtccca aggcagagtc agaggttcca caggaggagt cagagagtcaccaagggctc 1680 tcctggccca gggagcagtc aacaccatgg actgaacact tgctgggctccaacccttgg 1740 gccaggctgc ccatgtgggg ccaggaggca gctcagagtg ggagacagagagacaagtgt 1800 gctcag 1806 34 1698 DNA Homo sapiens misc_feature IncyteID No 7650474CB1 34 cacctgtcca ggacgacttg ttgattccca ggagggccgcctttccggtc tgggtccccg 60 agaggactgc 
cttgctcacc tgtcccctcg gcgcggccccggggagctcc cgagaggccc 120 ccgggatcgc tggccctccg aactccacag caatgagcaagttgggcaag ttctttaaag 180 ggggcggctc ttctaagagc cgagccgctc ccagtccccaggaggccctg gtccgacttc 240 gggagactga ggagatgctg ggcaagaaac aagagtacctggaaaatcga atccagagag 300 aaatcgccct ggccaagaag cacggcacgc agaataagcgagctgcatta caggcactaa 360 agagaaagaa gaggttcgag aaacagctca ctcagattgatggcacactt tctaccattg 420 agttccagag agaagccctg gagaactcac acaccaacactgaggtgttg aggaacatgg 480 gctttgcagc aaaagcgatg aaatctgttc atgaaaacatggatctgaac aaaatagatg 540 atttgatgca agagatcaca gagcaacagg atatcgcccaagaaatctca gaagcatttt 600 ctcaacgggt tggctttggt gatgactttg atgaggatgagttgatggca gaacttgaag 660 aattggaaca ggaggaatta aataagaaga tgacaaatatccgccttcca aatgtgcctt 720 cctcttctct cccagcacag ccaaatagaa aaccaggcatgtcgtccact gcacgtcgat 780 cccgagcagc atcttcccag agggcagaag aagaggatgatgatatcaaa caattggcag 840 cttgggctac ctaaactaaa acacattttt gatacctaaattaatgagct atagataaaa 900 tataaaaaat gtttttacca agttcagaag ttaacaaagactctgcttta taattatatt 960 gaatgaataa ttgtgtttta agcctcctaa gtaaaagtaaaaaaggagtc atgtgcatac 1020 atagaatcag tgatggaggc caggcacggt atctcatgcctgtaatccca gctacttggg 1080 aggctgagtc aggagaatcg cttgagccca ggagatggaggttgcagtga gccaagatca 1140 tgccactgca ctccagactg ggcaacagag ggagactccgtctcaaaaac taaaaaaaaa 1200 aaatacattt agtatagcgg gcggtggggg ggagaaataatgttatttcc tatgcgaatg 1260 acgtgtatcc ctgtacccat ggtaaatgta aatatactgtgtctcttttg ggagagcctt 1320 ttagtagagg agtcttatat gagtctctac ataagtagtttcacttgagt tttgcagttt 1380 gaaatcttaa aggagcttta attgacattt attataccaattaagcttgg aatggggcaa 1440 tggatgcatt tcccaaaacg tgtgaaagca ctaacagcttatattgctga atgagaatct 1500 cctgggtgta atttagccac ttagggaact gcgtgaacactcccaggcca ttatgatgct 1560 gttacagctt cagtgtataa atgcatgagt attctttctgttctgttttg tgctctcttg 1620 tacatttatt taccctttac agaatatttc ttgtaaatacataaaaatat tggcaattaa 1680 aagtacatct tgaataaa 1698 35 1753 DNA Homosapiens misc_feature Incyte ID No 8092378CB1 35 tttcacataa ctagaaggtgatagacagta tgacagtgga 
atacctcttg ttctcattta 60 ggacagtgtc tgaattgtagtgtgagagac ttaagttaga taatttagtg catttcctag 120 tagtgagagt tgttaattgttaacacttgg aaggaattgg tatcataagg aattgacatc 180 ataagttttg gaaatctttttctgctagga tgctttatag cctatctgaa tttgaaagtg 240 tagtgtagat taactgttttttagccatat tattctgtgg tattccattt aggtattgtg 300 aattattcag aatgcagaccccacgatata gaggtacaaa ttatattcct tccaaactaa 360 tacacaaact atattatgtgcttttctttg taaaacagtt tgttagggtt gtgctatttg 420 aacatatttt ctggtggatattctcagctt tgaacactgt acatttactg gtaagatttt 480 atttataaat tacacattttaaggaaattt ctatacttct aaaactcgaa atattttttg 540 aaagaggtaa ttgctatatatctgctttcc ttctaaaata tttcaaatat atctggtggc 600 tactgtttta tgtatggagaaggtaaggca ttgagattaa ctttctcata cttacaaaat 660 gggctgtgtg tggtgctgaaaataagaccg gtgtcctgac tcccagtttg ttttctgtgt 720 attagaccag actgcttacttaaagtttta ttgcgctaaa aatctgttca taagtttgag 780 ctccattttt ttctgtccattattacaaca tagagaagca cattcatatc cccagagaaa 840 tttagtgata tgcagcaactttgtttcact cttgacatta agtgtacatt gttaacattt 900 tttatcttat gattgaatgtttcaggtttt cctcctccac caggcgctcc acctccatct 960 cttataccaa caatagaaagtggacattcc tctggttatg atagtcgttc tgcacgtgca 1020 tttccatatg gcaatgttgcctttccccat cttcctggtt ctgctccttc gtggcctagt 1080 cttgtggaca ccagcaagcagtgggactat tatgccagaa gagagaaaga ccgagataga 1140 gagagagaca gagacagagagcgagaccgt gatcgggaca gagaaagaga acgcaccaga 1200 gagagagaga gggagcgtgatcacagtcct acaccaagtg ttttcaacag cgatgaagaa 1260 cgatacagat acagggaatatgcagaaaga ggttatgagc gtcacagagc aagtcgagaa 1320 aaagaagaac gacatagagaaagacgacac agggagaaag aggaaaccag acataagtct 1380 tctcgaagta atagtagacgtcgccatgaa agtgaagaag gagatagtca caggagacac 1440 aaacacaaaa aatctaaaagaagcaaagaa ggaaaagaag cgggcagtga gcctgcccct 1500 gaacaggaga gcaccgaagctacacctgca gaataggcat ggttttggcc ttttgtgtat 1560 attagtacca gaagtagatactataaatct tgttattttt ctggataatg tttaagaaat 1620 ttaccttaaa tcttgttctgtttgttagta tgaaaagtta actttttttt ccaaaataaa 1680 agagtgaatt tttcatgttaagttaaaaat ctttgtcttg tactatttca aaaataaaaa 1740 gacagcaatg att 1753 361489 
DNA Homo sapiens misc_feature Incyte ID No 1263178CB1 36 cgccggtgggccgcagatga agaggaggcg gtggcagtgg tggaagaaga ggcggcggcg 60 gcgggggtagggagcctgga aacgcgagcg gggatggctt ttggacaaac tcttctgaaa 120 ggtgggcaacctagaaagcc agttgatgcc acttgctatt ggagttgtgg gttggctggc 180 tcctctggctgatggcatgt tgaggtacat gggccagcgg cagcgagggc atccaatcca 240 gaggggtccactctagaggc caggccacca gcaccatggg ccagtgtgtc accaagtgta 300 agaatccctcatcgaccctg ggcagcaaaa atggagaccg tgagcccagc aacaagtcac 360 atagcaggcggggtgcaggc caccgtgagg agcaggtacc accctgtggc aagccaggtg 420 gagatatcctcgtcaacggg accaagaagg ccgaggctgc cactgaggcc tgccagctgc 480 caacgtcctcgggagatgct gggagggagt ccaagtccaa tgccgaggag tcttccttgc 540 aaagattggaagaactgttc aggcgctaca aggatgagcg ggaagatgca attttggagg 600 aaggcatggagcgcttttgc aatgacctgt gtgttgaccc cacagaattt cgagtgctgc 660 tcttggcttggaagttccag gctgcaacca tgtgcaaatt caccaggaag gagttttttg 720 atggctgcaaagcaataagt gcagacagca ttgacggaat ctgtgcacgg ttccctagcc 780 tcttaacagaagccaaacaa gaggataaat tcaaggatct ctaccggttt acatttcagt 840 ttggcctggactctgaagaa gggcagcggt cactgcatcg ggaaatagcc attgccctgt 900 ggaaactagtctttacccag aacaatcctc cggtattgga ccaatggcta aacttcctaa 960 cagagaacccctcggggatc aagggcatct cccgggacac ttggaacatg ttccttaact 1020 tcactcaggtgattggccct gacctcagca actacagtga agatgaggcc tggccaagtc 1080 tctttgacacctttgtggag tgggaaatgg agcgaaggaa aagagaaggg gaagggagag 1140 gtgcactcagctcagggcct gagggcttgt gtcccgagga gcagacttag tggctctgtc 1200 ccaggagcagcagcaaggat ctgccagctg ccctgcagcc aactgaggaa ttggaccatt 1260 ttggaaattactgaagatcc ggatattttc tactttacac ctttctctgc cttgtatctg 1320 aaagggctctaaaatgctgt atcatgtttt aggcactttc ttcatttttt tggttatttt 1380 ggttatttcctttttggggg gatctcccag aatatttgaa cctggttaca tgttgtgtat 1440 ctttttttgaagccttcaga tagaataagc ctgccatttc ttgcacaaa 1489 37 1383 DNA Homo sapiensmisc_feature Incyte ID No 3535417CB1 37 gggaggtcac aatcacattg agccaaaacgcatccagtgt tttctccagt tacaaataaa 60 acgaatatgc ccatgctgct accacatcctcaccagcatt tcctaaaagg ccttttaaga 120 gcacctttcc 
gatgttacca cttcatctttcactcaagta ctcatctcgg atcaggaatc 180 ccatgtgctc agccgtttaa ttctcttggactccattgta caaagtggat gctgctgtca 240 gatggcttaa agagaaaatt atgtgtacaaacaaccttaa aggaccacac agaaggactt 300 tctgataaag agcaaagatt tgtggataaactttatactg gtttaatcca agggcaaagg 360 gcctgtttag cagaggccat aactcttgtagaatcaactc acagcaggaa aaaggagtta 420 gcccaggtgc ttcttcagaa agtattactttaccacagag aacaagaaca atcaaataaa 480 ggaaaaccac tagcatttcg agtaggattgtctgggcccc ctggtgctgg aaaatcaaca 540 tttatagaat attttggaaa aatgcttactgagagagggc acaaattatc tgtgctagct 600 gtggaccctt cttcttgtac tagtggtggatcactcttag gtgataaaac ccgaatgact 660 gagttatcaa gagatatgaa tgcatacatcaggccatctc ctactagagg aactttagga 720 ggcgtgacaa ggaccacaaa tgaagctattctgttgtgtg aaggagcggg atatgacata 780 attcttattg aaaccgttgg tgtgggtcagtcggagtttg ctgttgctga catggttgac 840 atgtttgttt tactactgcc accagcaggaggagatgagc tgcagggtat caaaaggggt 900 ataatcgaga tggcagatct ggtagctgtaactaaatctg atggagactt gattgtgcca 960 gctcgaagga tacaagcgga atatgtgagtgcactgaaat tactccgcaa acgttcacaa 1020 gtctggaaac caaaggtaat tcgtatttctgcccgaagtg gagaggggat ctctgaaatg 1080 tgggataaaa tgaaagattt ccaggacctaatgcttgcca gtggggagct gactgccaaa 1140 cgacggaagc aacagaaagt ttggatgtggaatctcattc aggaaagtgt gttagagcat 1200 ttcaggaccc accccacagt ccgggaacagattccacttc tggaacaaaa ggttctcatt 1260 ggggccctgt ccccaggact agcagcagacttcttgttaa aagcttttaa aagcagagac 1320 taataaaatt catcctgtat aataattttacatatcattt cataaagtat tttaatagaa 1380 aaa 1383 7 38 3519 DNA Homosapiens misc_feature Incyte ID No 7473555CB1 38 aaatgctgat gctacccaggtccttgccca cttctgtctt atcctgcata cttttcctga 60 atcactcaac taatttttagtttcaacaat tgcttgtata tgatgacccc atacctctcc 120 caccgagctc cagaactgtaagttacatta ctccctggat tgtgcatgtt ccaaactgag 180 ctcatggcac tcccccacacctattctttc ttgtactctg tgagcgacat cacttgtccc 240 ccaagacagg cccctggaagccatcttgtg accaatcctg tgacagttgt ggccccagta 300 gccccaggtg tcttacctgtactgagaaga cagtgctgca tgatgggaaa tgcatgtctg 360 aatgccctgg cgggtactatgctgatgcca ctggcaggtg caaagtttgt cataactcat 
420 gtgccagctg ctctgggcccacaccctctc actgtacagc ctgcagcccc cccaaggctc 480 tgcgtcaagg ccactgtctgccccgctgtg gagagggttt ctactctgac cacggagtct 540 gcaaagaagc cagaggaaggactgcaagtg gagcagctgt ctggcgtggg catcccctct 600 ggcgagtgtc tagcccagtgtagagcccat ttttacttgg agagcactgg cctatgtgaa 660 gaacacccag aactttggaaattggctatt gccatgtctg aactgcgagt atgggaagga 720 aaaactattc tgactgttgttcccactacc atcccactct ctcagtatca acgtagaccc 780 agacattctg ctttacttacctccaacctg actgttccca tccagagttg tgttaagcct 840 ccttacatgc tgttagtaggaaatatcaaa atttggacaa ataaccaaat tgtccaatgc 900 attaattgtc atctatacacttgtattaac tcccattttg actccaggaa aagtgtaatg 960 ttggttcgag ctcgagaaggaatctggatt ccgcttgcca ccagtcctgt ttcagatgtg 1020 cagggaaaag cccacataactgcacagact gtgggccttc ccatgtgctg ttggatgggc 1080 agtgcctctc ccagtgcccagatggctact ttcaccagga aggtagttgc acagagtgtc 1140 acccaacctg caggcagtgtcatgggcctt tggagtctga ctgcatctcc tgttaccctc 1200 acatctctct taccaatggtaactgcagga ccagctgcag ggaagagcag ttcctcaacc 1260 tcgtgggata ctgtgctgactgccatcacc tgtgccagca ctgtgcagct gatctccaca 1320 acactgggag catctgcctcaggtgccaga atgcccacta cctgctgctc ggggaccact 1380 gtgttcctga ctgcccttcaggatactatg cagagagagg agcttgtaaa aaatgccact 1440 cctcctgcag aacctgccagggcagaggac ctttctcctg ctcctcatgt gacaccaacc 1500 tcgtgctgtc ccacactggcacctgcagca ccacctgctt ccctgggcac tatcttgatg 1560 acaatcatgt ttgccagcacccagagccaa tggagtattg aagtaggagt ggatgaccat 1620 ttcttagacc tacagcagaaaacaagtctg tttaagaagg tgattgaacc cccacactca 1680 catcacagat gtgatggttacagcacatcc cagcttccct cagtggtttc tggtcgactc 1740 tgccactttt tcactgtcctcctgcacttg ctgtgggggt cactgacctc agagaactgc 1800 ccctgctact ccaggtgtggccacaccaag atgtctgcgt cagtactaca tgcaacacac 1860 actgtggaag ctgtgattcacaggccagct gtacctcctg ccgagatcca aacaaggttc 1920 tgctctttgg ggaatgtcaatacgagagct gcgccccaca gtactatctt gacttctcca 1980 ccaacacgtg caaagtgcaaccgtaggttg aaacggtgta gctgtcgtcc ctgctggatg 2040 gaacaggatc tcttcagttcagcagtcccc accattccct ttctatcaca caaggcctat 2100 tctgaagata cattactttcctgggctgta ggtgtttgca 
tttacaactt gagattgcta 2160 cctccaagct tgataagtagttggaattct cggggcttac acactttgaa tggtagtgct 2220 cagcaacagg aaagcaaagctgcagacaga gtcctcataa atgaactgct gggtctgagg 2280 gtggacagaa aggaagataatctaatgcag actttctttt tagagtgtga ttggagctgc 2340 agtgcatgca gtgggcccctgaaaacagac tgcctgcagt gcatggatgg ctatgttctc 2400 caggatgggg cctgcgtggagcagtgcttg tcatcatttt accaggactc gggcctctgc 2460 aagaactgtg acagctactgtctccagtgc caaggtcccc atgagtgtac ccgctgcaaa 2520 gggccatttc tcctcttggaagcccagtgt gtccaggaat gtgggaaggg gtactttgca 2580 gatcatgcaa agcacaaatgcacagcctgc cctcaggggt gcttgcagtg cagccacagg 2640 gaccgttgtc acctctgtgaccatgggttc tttctgaaga gtggcctctg tgtttacaac 2700 tgtgttcctg gcttttctgtccacacctct aatgaaacat gttctggcaa aatacacacc 2760 cctagtcttc atgtgaatggttccctgatc ctcccaattg gttcaataaa gccactggat 2820 ttttccctcc tgaatgtccaagaccaggag ggtagggtca aagatctcct atttcatgtt 2880 gtgagcactc ccaccaatggtcagctagtg ctctcaagaa atggaaaaga ggttcagctg 2940 gacaaggctg gccgttttagctggaaagat gtgaacgaga agaaagtgcg ttttgtgcac 3000 agcaaagaaa aactcaggaaaggttacctt ttcctgaaaa ttagtgacca gcagttcttc 3060 tctgagccac agctgatcaacatacaagca ttttcaacac aggcccccta tgtgctgaga 3120 aatgaagttc tccacattagcagaggagag agggcaacca tcaccaccca gatgcttgac 3180 atccgagatg atgacaacccacaggatgtg gtcattgaaa taatcgatcc tccacttcat 3240 ggccaattgc ttcagacacttcagtccccg gcaaccccta tctatcaatt ccagctggat 3300 gaactctcta gaggccttctccactatgct catgatggtt cagacagcac atccgatgtt 3360 gcagtcttgc aggccaatgatggacactcc ttccataata tactgttcca agtgaagacc 3420 gtgcctcagg ggcagtcttatgatggtatg tcttctcttg atcctgcacc atcaagcagg 3480 gtgctgagag cctttattttcctcatgtta gttccctga 3519 39 2763 DNA Homo sapiens misc_feature IncyteID No 7474985CB1 39 atggcagatt catcatcatc ttctttcttt cctgattttgggctgctatt gtatttggag 60 gagctaaaca aagaggaatt aaatacattc aagttattcctaaaggagac catggaacct 120 gagcatggcc tgacaccctg gaatgaagtg aagaaggccaggcgggagga cctggccaat 180 ttgatgaaga aatattatcc aggagagaaa gcctggagtgtgtctctcaa aatctttggc 240 aagatgaacc tgaaggatct gtgtgagaga 
gcgaaagaagagatcaactg gtcggcccag 300 actataggac cagatgatgc caaggctgga gagacacaagaagatcagga ggcagtgctg 360 gtcatagtta acacaggggt ccccaactcc tgggccacagacccctactg gtcggcggcc 420 cctcgggaat caggtcgcat agcaggaggt gatggaacagaatacagaaa tagaataaag 480 gaaaaatttt gcatcacttg ggacaagaag tctttggctggaaagcctga agatttccat 540 catggaattg cagagaaaga tagaaaactg ttggaacacttgttcgatgt ggatgtcaaa 600 accggtgcac agccacagat cgtggtgctt cagggagctgctggagttgg gaaaacaacc 660 ttggtgagaa aggcaatgtt agattgggca gagggcagtctctaccagca gaggtttaag 720 tatgtttttt atctcaatgg gagagaaatt aaccagctgaaagagagaag ctttgctcaa 780 ttgatatcaa aggactggcc cagcacagaa ggccccattgaagaaatcat gtaccagcca 840 agtagcctct tgtttattat tgacagtttc gatgaactgaactttgcctt tgaagaacct 900 gagtttgcac tgtgcgaaga ctggacccaa gaacacccagtgtccttcct catgagtagt 960 ttgctgagga aagtgatgct ccctgaggca tccttattggtgacaacaag actcacaact 1020 tctaagagac taaagcagtt gttgaagaat caccattatgtagagctact aggaatgtct 1080 gaggatgcaa gagaggagta tatttaccag ttttttgaagataagaggtg ggccatgaaa 1140 gtattcagtt cactaaaaag caatgagatg ctgtttagcatgtgccaagt ccccctagtg 1200 tgctgggccg cttgtacttg tctgaagcag caaatggagaagggtggtga tgtcacattg 1260 acctgccaaa caaccacagc tctgtttacc tgctatatttctagcttgtt cacaccagta 1320 gatggaggct ctcctagtct acccaaccaa gcccagctgagaagactgtg ccaagtcgct 1380 gccaaaggaa tatggactat gacttacgtg ttttacagagaaaatctcag aaggcttggg 1440 ttaactcaat ctgatgtctc tagttttatg gacagcaatattattcagaa ggacgcagag 1500 tatgaaaact gctatgtgtt cacccacctt catgttcaggagttttttgc agctatgttc 1560 tatatgttga aaggcagttg ggaagctggg aacccttcctgccagccttt tgaagatttg 1620 aagtcattac ttcaaagcac aagttataaa gacccccatttgacacagat gaagtgcttt 1680 ttgtttggcc ttttgaatga agatcgagta aaacaactggagaggacttt taactgtaaa 1740 atgtcactga agataaaatc aaagttactt cagtgtatggaagtattagg aaacagtgac 1800 tattctccat cacagctggg atttctggag ttgtttcactgtctgtatga gactcaagat 1860 aaagcgttta taagccaggc aatgagatgt ttcccaaaggttgccattaa tatttgtgag 1920 aaaatacatt tgcttgtatc ttctttctgc cttaagcactgccggtgttt gcggaccatc 1980 aggctgtctg 
taactgtggt atttgagaag aagatattaaaaacaagcct cccaactaac 2040 acttggttgg aatactgtgg tttgacatct ctctgctgtcaagatctctc ctctgctctt 2100 atctgcaaca aaagactgat aaaaatgaat ctgacacagaataccttagg atatgaagga 2160 attgtgaagt tatataaagt cttgaagtct cctaagtgtaaactacaagt tctaggctta 2220 gaaagctgtg gtctcacaga ggctggctgt gagtatctttctttggctct catcagcaat 2280 aaaagactga cacatttgtg cttggcagac aatgtcttgggtgatggtgg agtaaagctt 2340 atgagtgatg ccctgcaaca tgcacaatgt actctgaagagccttgtatt gatgggctgt 2400 gttctcacta atgcatgttg tctggatctg gcttctgttattttgaataa cccaaacctg 2460 aggagcctgg accttgggaa caacgatttg caggatgatggagtgaaaat tctgtgtgat 2520 gctttgagat atccaaactg taacattcag aggctcgggttgctcattac aaacacaatc 2580 ctggaaaatg gcccagaact tcaactcagt gaccagcccagaaggctgag aaagaagagc 2640 atggaaatgg tcctctttca aggtgcttgc tggctaccacacactattga tctgagtggt 2700 gactcttctt tggccgatga aagtgaaagg attagcagtcatactgttct ccacattgtg 2760 tga 2763 40 1023 DNA Homo sapiensmisc_feature Incyte ID No 7475137CB1 40 atggggagaa accagagcgg aaaagctgaaaattctaaaa atcagagcgc ctcttctcct 60 ccaaaggact gcaactcccc gccagcaatggaacaaagct ggatagagaa tgactttgac 120 gagttgacag aagtaggctt cagaaggtcggtaataacaa actactcctc cgagctaaag 180 gagcatgttc aaacccattg caaggaagctaaaaaccttg aaaaatggtt agacgaatgg 240 ctaaatagaa taaacagtgt agagaagaccttaaatgacc tgaaggagct gaaaaccatg 300 gcacaagaac ttcgtgaggc atgcacaagcttcaatagcc aattcaatca agtggaagaa 360 agggtgtcag tgattgaaga tcaaattaatgaaataaagc gagaagacat ggttagagaa 420 aaaagattaa aaagaaacaa acaaagcctccaagaaatat ggtactatgt gaaaagacca 480 aatctacatt tgattagtgt acctgaaactgatggggcga acagaaccaa gttggaacac 540 actcttcatg atattatcca gaacttgcccaaaatagcaa ggcaggccaa cactcaaatt 600 caggaaatac agagaacacc acaaagatactcctcgagaa tagcaacccc aagacatgta 660 gtcagattca ccaaggttaa aatgaaggaaaaaatgttaa cggcagccag agagaaaggt 720 caggttaccc acaaagggaa gcccatcagactaacagcag atctcttggc agaaacccta 780 caagccagaa gacagtgggg gccaatattcaacattctta aagagaagaa ttttcaatcc 840 agaattccat atccagccaa actaagtttcataagtgaag gagaaataaa 
atcctttaca 900 gaaaaacaaa tgctgagaga ttttgtcaccaccaggcctg ccttacaaga gctcctgaag 960 gaagcactaa atatggaaag gaacaaccagtaccagccac tacaaaaaca tgccaaattg 1020 taa 1023 41 2046 DNA Homo sapiensmisc_feature Incyte ID No 5036986CB1 41 gtgctctctt ccaaggctgt aggagttctggagctgctgg ctggagagga gggtggacga 60 agctctctcc agaaagacat cctgagaggacttggcaggc ctgaacatgc attggctgcg 120 aaaagttcag ggactttgca ccctgtggggtactcagatg tccagccgca ctctctacat 180 taatagtagg caactggtgt ccctgcagtggggccaccag gaagtgccgg ccaagtttaa 240 ctttgctagt gatgtgttgg atcactgggctgacatggag aaggctggca agcgactccc 300 aagcccagcc ctgtggtggg tgaatgggaaggggaaggaa ttaatgtgga atttcagaga 360 actgagtgaa aacagccagc aggcagccaacgtcctctcg ggagcctgtg gcctgcagcg 420 tggggatcgt gtggcagtga tgctgccccgagtgcctgag tggtggctgg tgatcctggg 480 ctgcattcga gcaggtctca tctttatgcctggaaccatc cagatgaaat ccactgacat 540 actgtatagg ttgcagatgt ctaaggccaaggctattgtt gctggggatg aagtcatcca 600 agaagtggac acagtggcat ctgaatgtccttctctgaga attaagctac tggtgtctga 660 gaaaagctgc gatgggtggc tgaacttcaagaaactacta aatgaggcat ccaccactca 720 tcactgtgtg gagactggaa gccaggaagcatctgccatc tacttcacta gtgggaccag 780 tggtcttccc aagatggcag aacattcctactcgagcctg ggcctcaagg ccaagatgga 840 tgctggttgg acaggcctgc aagcctctgatataatgtgg accatatcag acacaggttg 900 gatactgaac atcttgggct cacttttggaatcttggaca ttaggagcat gcacatttgt 960 tcatctcttg ccaaagtttg acccactggttattctaaag acactctcca gttatccaat 1020 caagagtatg atgggtgccc ctattgtttaccggatgttg ctacagcagg atctttccag 1080 ttacaagttc ccccatctac agaactgcctcgctggaggg gagtcccttc ttccagaaac 1140 tctggagaac tggagggccc agacaggactggacatccga gaattctatg gccagacaga 1200 aacgggatta acttgcatgg tttccaagacaatgaaaatc aaaccaggat acatgggaac 1260 ggctgcttcc tgttatgatg tacaggttatagatgataag ggcaacgtcc tgccccccgg 1320 cacagaagga gacattggca tcagggtcaaacccatcagg cctataggca tcttctctgg 1380 ctatgtggaa aatcccgaca agacagcagccaacattcga ggagactttt ggctccttgg 1440 agaccgggga atcaaagatg aagatgggtatttccagttt atgggacggg cagatgatat 1500 cattaactcc agcgggtacc 
ggattggaccctcggaggta gagaatgcac tgatgaagca 1560 ccctgctgtg gttgagacgg ctgtgatcagcagcccagac cccgtccgag gagaggtggt 1620 gaaggcattt gtgatactgg cctcgcagttcctatcccat gacccagaac agctcaccaa 1680 ggagctgcag cagcatgtga agtcagtgacagccccatac aagtacccaa gaaagataga 1740 gtttgtcttg aacctgccca agactgtcacagggaaaatt caacgaacca aacttcgaga 1800 caaggagtgg aagatgtccg gaaaagcccgtgcgcagtga ggcgtctagg agacattcat 1860 ttggattccc ctcttctttc tctttcttttccctttgggc ccttggcctt actatgatga 1920 tatgagattc tttatgaaag aacatgaatgtaagttttgt cttgccctgg ttattagcac 1980 aaaacattac tatgttagat attgaaataaggaagaaaag aaagaggaga tgaaaggggg 2040 agaaaa 2046 42 857 DNA Homosapiens misc_feature Incyte ID No 1375644CB1 42 ggccccgcgc gcccaggcgaggcactgggc tccctctgct ccccgtgggc cgctccccgc 60 gtggggccac tgcccccggcccccgccatg gtgcggattt caaagcccaa gacgtttcag 120 gcctacttgg atgattgtcaccggaggtat agctgtgccc actgccgcgc tcacctggcc 180 aaccacgacg acctcatctccaagtccttc cagggcagtc aggggcgtgc ctacctcttc 240 aactcagtgg tgaacgtgggctgcgggcca gccgaggagc gggtgctgct gaccggcctc 300 catgctgtcg ccgacatccactgcgagaac tgcaagacca ctttgggctg gaaatatgaa 360 caggcctttg agagcagccagaagtacaaa gaggggaagt acatcattga actcaaccac 420 atgatcaaag acaacggctgggactgaccc ccgctccccc gacgcatgtg gctccagccc 480 ggcctggccg ccagggagcgccactggctt cccgccaccc gaagggagct ctggaccctc 540 agagcccctg cagaggacggatctagctcc tgtatatata ttttattgca tgcactgtga 600 ccttgggggg agaacagaagggggacgacg cccccgcacc tcctgcgatc tggctggctt 660 ggatctcgtt tttaacccgttcctgcccca cctgccctat agttatggcc tgtgttctgc 720 tctcctggtc ccagtgcaccgtctgctctg tgaactcctc ccaccaggcc cttcttaccc 780 cacgcgtgtc tgtccccctcgttctgtagc gtttgtacat aataaaacaa tggagtggag 840 acaaaaaaaa aaaaaaa 85743 1485 DNA Homo sapiens misc_feature Incyte ID No 1356261CB1 43gcggcctggc cttcaggcca ctggctaccg aaccccgggg ctcttcacca gtccagctcg 60tttccagcac catgtcggtg cggacgctac cgctgctctt cttgaacttg ggcggggaga 120tgctttacat cctcgaccaa cggctgcggg cccagaacat cccgggagac aaggcccgca 180aagatgaatg gacagaggtg gacagaaaac gagttctgaa tgacatcatc 
tccaccatgt 240tcaatagaaa gtttatggag gaattattca agcctcaaga gctctactcc aagaaggccc 300tgaggactgt ctatgagcgc ctggctcatg cctccattat gaaactgaac caggccagca 360tggataagct ctatgacctg atgaccatgg ctttcaaata tcaagtattg ctgtgtcccc 420gacccaagga tgtgctgctg gtcactttca atcacttgga taccatcaag ggattcatcc 480gagactcccc agccatcctg cagcaagtgg acgagacttt gcggcagctg acagaaatat 540atggtggtct ctctgcaggg gagttccagc tgatccggca gacactcctc atcttcttcc 600aagacctgca catccgagta tccatgtttc taaaggacaa agttcagaat aataacggtc 660gctttgtgtt gccggtgtcc gggcctgttc cttggggaac tgaagttcca ggactcatca 720gaatgttcaa caacaaaggt gaagaagtga agaggataga attcaagcat ggtggaaact 780atgtccctgc acccaaagaa ggttcttttg aactttatgg agaccgagtc ctgaaactgg 840gaactaacat gtacagcgtg aatcagcctg tggaaactca tgtgtctgga tcatcaaaga 900acttagcctc atggacccag gaaagcattg ctccaaaccc tcttgctaaa gaagagctga 960atttcttggc caggctgatg ggagggatgg agattaagaa acccagtggc cctgagccca 1020gattccggtt gaatctcttt accaccgatg aagaagagga acaagcagcg ctaaccaggc 1080cagaagagtt atcctatgaa gttatcaaca tacaagccac ccaggaccag caacggagcg 1140aggagctggc tcgaatcatg ggggagtttg agatcacgga gcagccaagg ctgagcacca 1200gcaaagggga cgatttgctc gccatgatgg atgagttata gctgttctga ccaggcgtcc 1260tctgccccca gggagaggct gctggatggt gacccctggg gaatgcccca tggcccagaa 1320tgatgctgct agttttctac tgagtgaagc cattacgtct atttcttatt tatgttgtaa 1380ggaactgtgt gagtctccct tgaggagcac tcactcttga aggcacacac atacacatat 1440tttcagtgaa atatattctg acttttaaac ttgaaaaaaa aaaaa 1485 44 989 DNA Homosapiens misc_feature Incyte ID No 1419379CB1 44 agaaaatggc ggcccccagaccgcactgcg ggcgtcgcgg caggtgaagc ggtggtggca 60 gaggcgcctg cggttactggccgggccgac gggtcctgag tctccagagc tcggttctta 120 catccccgcg tccccaggtttagcctgagc agggttgtgg aaggccggga cccattgaca 180 gcacgatggt gacggagcaggaggtggatg ccatcgggca gacgctggtg gaccccaagc 240 agcccctgca ggcccgcttccgggcgctgt tcacgctgcg tgggctcggc ggcccaggcg 300 ccattgcatg gatcagccaggccttcgatg acgattccgc cctgctcaag cacgagctgg 360 cctactgcct gggccagatgcaggatgccc gcgccatccc catgctggtg gacgtgctgc 
420 aagacacccg tcaggagcccatggtgcgcc atgaggcagg ggaggccctg ggggccatcg 480 gggacccgga agttctggagatcctgaagc agtattcctc ggaccccgtc atcgaggtgg 540 ccgagacctg ccagctggccgtgcgcaggc tggagtggct gcagcagcac ggcggggagc 600 cggcggcggg accctacctctccgtggacc ctgccccgcc ggctgaggag cgtgacgtgg 660 ggcgcctgcg ggaggcgctgctggatgagt cccggccgct cttcgagcga taccgcgcca 720 tgttcgccct gcgcaacgcgggaggcgagg aggccgccct ggcgctggcc gagggtctgc 780 actgtgggag cgccctcttccgccacgagg tcggctacgt cctgggacag ctgcagcacg 840 aggcagagct ggggctctgggaagaggctg acgttatggt ggcttcagct tcactaggaa 900 tgggacacag ggtctgggggcctctgactc ccccaccccg aggcctgggt agggacaggg 960 tggtggtccc tgagtgggtcaggtagggc 989 45 3886 DNA Homo sapiens misc_feature Incyte ID No3509272CB1 45 gtgggcagaa aagctttgtt aacctccttt tacagatgag gaaaaacaagatcagaggtg 60 ctaagtgctg tagcctagtg ccaggtcttc tggccccaat tctgggttctccccaagccc 120 atgtttcttc ccctttctca caatctttac ttcttcctct gaccctcaccaccacccaaa 180 gtacttttaa ttctagaaaa gaaacccagc tgcacactgg cacacctgaccttcatgcag 240 tcagaagctt tggatgattc cccatccaaa atattagaga tgaaatgaaagcaaagtagg 300 catctgacaa aagttgcttt ttcccttctg cattttagga cctcaagtaatgtttatcca 360 gaaactgcta tcataccagg gattcattgt gtatttaaca acataggcatgcaatctggc 420 aaatttgaaa aactcttaac atacacccca aatccctgcc caaatttaagaactagggtg 480 gacacagtgc gtttttccat gtcgcatctt ctgtgatggg gctacgatacgtgggagcag 540 agaatgggga gggtggagcg catgccagat gaggatctat cagcaatgggacggggcctc 600 cactttagca tctccaccct gctcctctca gaggaccgcc tttcattgcattcagctgtg 660 atggtagcac gaacacaggt gcaccgagga cgaggagagc aggagccttgtgctctctct 720 gcatctgagg caggacagca cagggtacgg agcagtctgc agagaggccagctcatcagg 780 gaagcacttg tcttccacct tgggctttga ctgagcactg ggcaattggcctctggggat 840 caacgaaata atcctaaaca gagttactct atgtcacact atggaatgttccaagtaggt 900 ggccgtgttt tcaaaagatg tattttctcc ttttgttgtt gccatttcataggtttagga 960 ttgggtgtgt gtttctcctc tctgaatggc actcgaatgt ttgctgactcctactctgtg 1020 tgactggggt gtacagctat ggactgatgc atcccatccc atcatctttcatgatcaaag 1080 cagtctcttc ttttttgaca gctgaagaag 
catcggtagg gaatccagaaggagcgttca 1140 tgaaggtgtt acaagcccgg aagaactaca caagcactga gctgattgttgagccagagg 1200 agccctcaga cagcagtggc atcaacttgt caggctttgg gagtgagcagctagacacca 1260 atgacgagag tgattttatc agtacactaa gttacatctt gccttatttctcagcggtaa 1320 acctagatgt gaaatcactg ttactaccgt taattaaact gccaaccacaggaaacagcc 1380 tggcaaagat tcaaactgta ggccaaaacc ggcagagagt gaagagagtcctcatgggcc 1440 caaggagcat ccagaaaagg cacttcaaag aggtaggaag gcagagcatcaggagggaac 1500 agggtgccca ggcatctgtg gagaacgctg ccgaagaaaa aaggctcgggagtccagccc 1560 caagggaggt ggaacagccc cacacacagc aggggcctga gaagttagcgggaaacgccg 1620 tctacaccaa gccttccttc acccaagagc ataaggcagc agtctctgtgctgaaaccct 1680 tctccaaggg cgcgccttct acctccagcc ctgcaaaagc cctaccacaggtgagagaca 1740 gatggaaaga cttaacccac gctatttcca ttttagaaag tgcaaaggctagagttacaa 1800 atacgaagac gtctaaacca atcgtacatg ccagaaaaaa ataccgctttcacaaaactc 1860 gctcccacgt gacccacaga acacccaaag tcaaaaagag tccaaaggtcagaaagaaaa 1920 gttatctgag tagactgatg ctcgcaaaca ggcttccatt ctctgcagcgaagagcctca 1980 taaattcccc ttcacaaggg gctttttcat ccttaggaga cctgagtcctcaagaaaacc 2040 cttttctgga agtatctgct ccttcagaac attttataga aaagaataatacaaaacaca 2100 caactgcaag aaatgccttt gaagaaaatg attttatgga aaacactaacatgccagaag 2160 gaaccatctc tgaaaacaca aactacaatc atcctcctga ggcagattccgctgggactg 2220 cattcaactt agggccaact gttaaacaaa ctgagacaaa atgggaatacaacaacgtgg 2280 gcactgacct gtcccccgag cccaaaagct tcaattaccc attgctctcgtccccaggtg 2340 atcagtttga aattcagcta acccagcagc tacagtccct tatccccaacaacaatgtga 2400 gaaggctcat tgctcatgtt atccggacct tgaagatgga ctgctctggggcccatgtgc 2460 aagtgacctg tgccaagctc atctccagga caggccacct gatgaagcttctcagtgggc 2520 agcaggaagt aaaggcatcc aagatagaat gggatacgga ccaatggaagattgagaact 2580 acattaatga gagcacagaa gcccagagtg aacagaaaga gaagtcgcttgagctcaaaa 2640 aagaagttcc aggatatggc tatactgaca aactcatctt ggcattaattgttactggaa 2700 tactaacgat tttgattata cttttctgcc tcattgtgat atgttgtcaccgaaggtcat 2760 tacaagaaga tgaagaagga ttctcaaggg gcattttcag atttctgccacggaggggat 2820 
gctcttcgcg aagggagagt caggatggac tttcctcatt tggacagccgctctggttta 2880 aagatatgta caaacctctc agtgccacaa gaataaataa tcatgcatggaagctgcaca 2940 agaagtcatc taatgaggac aagatcctca acagggaccc tggggacagcgaagccccaa 3000 cggaggagga ggagagtgaa gccctgccat aggaggagaa cacagcccacctcaggcctc 3060 ctgcaaaaat acatagaata aacaacaaca gttactaaat gaatgaaaattgtgattccg 3120 atgaagcctg ccagagaaaa aaagcatttt ttaaaagagg aaataaggtgatatctgatt 3180 agggcaaaca tgatgcagac aagaaatgca ccggttcaga ggagggaaggtcaggccgcc 3240 tggggagagt ccatgaaaaa gatggaacgt gccagatgct gtacctggtgctgggaaaga 3300 gttgactagg ccagcatccc tttcctcaaa gggggggctc ctagactggggggagggctg 3360 gacatctgaa tacatcctga ggagacagtg tgggacagca tggtggcagtggaaccagcc 3420 gtggttctgc tcttggtcgg ctggaaagga gtagatgtaa gggatggtttagaagaaggg 3480 aagtggaaga aaagttttct gagctgacaa gaggaaggaa aggccgcctagaaggacact 3540 aaaaaggcaa gagaagccct aagcagagtg agcaccagac tccacaggttaagggctcag 3600 tcacacagga ccatccgcat gtcagacccc aggtgcaagg ccaagcatcacctatgcatc 3660 tgaccaactg gctgtaaatt ggaggtcccc acaactccct cctcaggtttgaacatttgc 3720 tagaacagct catggaaccc aggaaaacag ttttcttact agtgctgatttattacaaag 3780 gatatttaaa aggacacaaa tgatgaagcc agttgaagag atacacagggtgaggttgga 3840 aggtcctgtg gagtggggtg accatctctg gacatggttg tcgcac 388646 2677 DNA Homo sapiens misc_feature Incyte ID No 708425CB1 46ttgatttctt tcgtgttgtg tttttgaagg tggaacctaa cttcttgtgg aacaagtgtt 60gccagctcag aaggcagtga ggagctgttt tcatctgtgt ctgttggaga tcaagatgat 120tgctattccc tgttagatga tcaggacttc acttcttttg atttatttcc tgaggggagt 180gtctgcagtg atgtctcttc ttctattagc acttactggg attggtcaga tagcgagttt 240gaatggcagt taccaggcag tgacattgcc agtgggagtg atgtactttc tgatgtcata 300cccagtattc caagttcacc ttgcctgctt cctaaaaaga aaaacaagca ccggaattta 360gatgaactcc cttggagtgc aatgacaaat gatgagcagg tggaatatat tgagtatctg 420agtcggaaag tgagtactga gatgggtctt cgggagcaac ttgatattat taagatcatt 480gatccttctg ctcaaatctc ccctacagac agtgagttta ttattgaact taactgtctc 540acagatgaaa aactgaagca ggtcagaaac tatatcaagg aacatagccc tcgccaacgg 
600cctgcaagag aggcctggaa gagaagcaac tttagttgtg caagcaccag tggagtgagc 660ggtgccagtg ccagcgccag cagcagcagt gccagcatgg tcagttctgc aagcagcagt 720gggtccagtg ttggaaactc tgcttcaaac tccagtgcca acatgagtcg agcacacagt 780gacagcaacc tgtctgcaag tgcagcagag cggattcggg attcaaaaaa gcgatccaag 840cagcggaagt tacagcagaa ggccttccgc aagaggcagc tgaaggagca gaggcaggcc 900cggaaggaga ggctcagtgg gctcttcctt aacgaagagg tgctgtcctt gaaagtgact 960gaggaagacc atgaagcaga tgttgatgtt ttgatgtaat aagggtgaat ttatcaacgt 1020tctttgtgag cattaaaata ctccatcctt atgggtttac atgcatcttg accaaaattt 1080tggttagggc tgtgctgatg taccattaat aatttaagcc agtggttctt ggcaggtgat 1140cccaagacct gcagcttcag catcacttga gaagttgtta ggaatgcata ctagtgggcc 1200ccgcccccag acatagtgaa tcagaaacca acagggaggc gcctagcatt gtttttttaa 1260caagtgctgg gttattctga tgcacagtct agtttaagaa ccactacttt gggtaaacgt 1320tttgactgtt taaagtttat ggcggtgaag tgggcgtctt caaagactag tacttacaca 1380gtttagaaga tttcaaggta ctgctgacag tagtttatta tgtcagtata catacatgtg 1440tagagatcat aatttagttc ccttcttaat gttacaattt cttagtttac ttttcctaaa 1500gggccatagc ataattcttg attcctggtg gaaatctttt ctgaggtgtg ggggtgggca 1560aggtgtggat tgctgtttac gatagtgcct tcattagttt tagttctgtc tgttttcatt 1620cattattgac tcaaaggtat tagaacaggc ccttatcttt ttcctattag atttattttt 1680gttttttact ttatgtaagt tcagaatcct tttttaagtg atgactactg atgaaataat 1740gttactagta gctgaatttt agacttgatg ctatgttgat taatatttaa atggtgaaag 1800taattaggca aaataagcaa ttctgctttt ctgttgcttt cagaattttt aggctgtaaa 1860attgtctcag gcaggaagtc agtctatatc tctctcaagt tcttattgta ggtcattttt 1920ttttcagcac agctctaatt tttaaatatt tggtataaca gcttctggca tgtgacctag 1980ttaatgttga aaattaactg atgaaaaccc ttgtagatct gctaaacttc tctataaatt 2040gctagaaata tacagagcaa tgagatatag aaaacacctt agcttgttgg ttattaatag 2100agctctaatt ttaactgtga aatcatggag tttctttaaa tgttttccta gagtgtttgt 2160tatcatgaac attagaaata gcattattcc ttttcttcta ccacatgtct tgaacattca 2220catgggttaa agaagatgga aactgatact actgacaatc aagctgcttt ctgactttca 2280taattttctt tatattgtgg atagcatttt 
gaaggtgacc gtgattctgt aggaagagct 2340gtaatcaata tttttcaaac tctaagagta cagaggttag taagatttct ttttttttta 2400ctttttgtta cgtcaagtag cttattaaga aaaagattca tgacttaaat tcttaattct 2460cagttgatac aaaggcttat aatcatgtgt tttaaaattt tgatttcatt ttcgttcctt 2520gaaggttaaa gaatggaaac tggggataaa aattaataca ctgcagcttt gccaaggtct 2580gtgatatggt tgtaaccttt taaagactac taaagtcata atttgaaatt tttactaaaa 2640taaaacatat gtgtctatgg ttttcaaaaa aaaaaaa 2677 47 3031 DNA Homo sapiensmisc_feature Incyte ID No 7469966CB1 47 agaaatcaaa agaagaacac tctttatcgaggtggcaaag tatcctggtg gaacttctat 60 ttttcttctg aaaactattc ttcagatttgttaaaaggga cagttatgaa aagcttaaag 120 atttaataca ctgctgggca gagtctcctaaaccaatatt tgcaaaaatc atcaatcttt 180 atcatcatcc aggctgtgga ggtaccacactggctatgca tgttctctgg gacttaaaga 240 aaaacttcag atgtgctgtg ttaaaaaacaagacaactga ttttgcagaa attgcagagc 300 aagtgatcaa tctggtcacc tatagggcaaagagccatca ggattacatt cctgtgcttc 360 tccttgtgga tgattttgaa gaacaagaaaatgtctactt tctacaaaat gccatccatt 420 ccgttttagc agaaaaggat ttgcgatatgaaaaaacatt ggtaattatc ttaaactgca 480 tgagatcccg gaatccagat gaaagtgcaaaattggcaga cagtattgca ctaaattacc 540 aactttcttc caaggaacaa agagcctttggtgccaaact gaaggaaatt gaaaagcagc 600 acaagaactg tgaaaacttt tattccttcatgatcatgaa aagcaatttt gatgaaacat 660 atatagaaaa tgtagtcagg aatatcctaaaaggacagga tgttgacagc aaggaagcac 720 aactcatttc cttcctggct ttactcagctcttatgttac tgactctaca atttcagttt 780 cacagtgtga aatatttttg ggaatcatatacactagtac accctgggaa cctgaaagct 840 tagaagacaa gatgggaact tattctacacttctaataaa aacagaagtt gcagaatatg 900 ggagatacac aggtgtgcgt atcattcaccctctgattgc cctgtactgt ctaaaagaac 960 tggaaagaag ctatcacttg gataaatgtcaaattgcatt gaatatatta gaagagaatt 1020 tattctatga ttctggaata ggaagagacaaatttcaaca tgatgttcaa actcttctgc 1080 ttacaagaca gcgcaaggtg tatggagatgaaacagacac tctgttttcc ccattaatgg 1140 aagctttaca gaataaagac attgaaaaggtcttgagtgc aggaagtaga cgattcccac 1200 aaaatgcatt catttgtcaa gccttagcaagacatttcta cattaaagag aaggacttta 1260 acacagctct ggactgggca cgtcaggccaaaatgaaagc 
acctaaaaat tcctatattt 1320 cagatacact aggtcaagtc tacaaaagtgaaatcaaatg gtggttggat gggaacaaaa 1380 actgtaggag cattactgtt aatgacctaacacatctcct agaagctgcg gaaaaagcct 1440 caagagcttt caaagaatcc caaaggcaaactgatagtaa aaactatgaa accgagaact 1500 ggtcaccaca gaagtcccag agacgatatgacatgtataa cacagcttgt ttcttgggtg 1560 aaatagaagt tggtctttac actatccagattcttcagct cactcccttt ttccacaaag 1620 aaaatgaatt atccaaaaaa catatggtgcaatttttatc aggaaagtgg accattcctc 1680 ctgatcccag aaatgaatgt tatttggctcttagcaagtt cacatcccac ctaaaaaatt 1740 tacaatcaga tctgaaaagg tgctttgacttttttattga ttatatggtt cttctgaaaa 1800 tgaggtatac ccaaaaagaa attgcagaaatcatgttaag caagaaagtc agtcgttgtt 1860 tcaggaaata cacagaactt ttctgtcatttggatccatg tctattacaa agtaaagaga 1920 gtcaattact ccaggaggag aattgcaggaaaaagctaga agctctgaga gcagataggt 1980 ttgctggact cttggaatat cttaatccaaactacaaaga tgctaccacc atggaaagta 2040 tagtgaatga atatgccttc ctactgcagcaaaactcaaa aaagcccatg acaaatgaga 2100 aacaaaattc cattttggcc aacattattctgagttgtct aaagcccaac tccaagttaa 2160 ttcaaccact taccacgcta aaaaaacaactccgagaggt cttgcaattt gtaggactaa 2220 gtcatcaata tccaggtcct tatttcttggcctgcctcct gttctggcca gaaaatcaag 2280 agctagatca agattccaaa ctaatagaaaagtatgtttc atccttaaat agatccttca 2340 ggggacagta caagcgcatg tgcaggtccaagcaggcaag cacacttttc tatctgggca 2400 aaaggaaggg tctaaacagt attgttcacaaggccaaaat agagcagtac tttgataaag 2460 cacaaaatac aaattccctc tggcacagtggggatgtgtg gaaaaaaaat gaagtcaaag 2520 acctcctgcg tcgtctaact ggtcaggctgaaggcaagct aatctctgta gaatatggaa 2580 cagaggaaaa aataaaaata ccagtaatatctgtttattc aggtccactc agaagtggta 2640 ggaacataga aagagtgtct ttctacctaggattttccat tgaaggccct ctggcatatg 2700 atatagaagt aatttaagac aatacatcacctgtagttca aatatgttta tttatatctt 2760 tatgatttta ttctctctct ctattctcatggcactttca taacattatg gctaacctct 2820 aattacagat tttgcttttg cctccctgaatgaattacaa gcctttttaa gatatgaaat 2880 atgcctaccc gcagagcttg gcacaaagtggagtcaatct tttaatgttt taaatatgca 2940 ttttcagact caaataatta agaagtttcattgatatcca ctggtcacat cataactgtc 3000 tatagggcaa 
taaaatctgt gttaaactca a3031 48 2415 DNA Homo sapiens misc_feature Incyte ID No 6920129CB1 48cctggcacca ttttggtcgg acagttttgt cttgcagggg gctgtcctgt gcattataga 60atgtttagca gcatccgtgg cctctaccca ctagatgcca gtagcacctc tcccttgagt 120tgtgacaatc aaaaacatct ccattcattg ccaaatccca ctccccccgc cacagacaca 180gttccctggt tgagacccat tggtttaaat aagtatgtgt tttctaagat gaactggaac 240tgcatctact tggaatggtt tggaatttct caagatattt tgctcgagtg tgatacagaa 300tttagaattt ttttttaatc tctttctgtg ttgctatacg cagccttaaa acgttcttga 360gttaattaga tgagccaaag agatggtgtc tgtgggtcgc atgaagtggc tggtgcagcc 420tcccctggtg ctgatggcgg gctctctttg gcagcgtact gtaagaactc tgtggacggc 480ctctggtact gcttcgatga cagcgatgtg cagcagctgt cagaagatga ggtctgcacg 540cagacagcat acatcctctt ctaccagagg cggacagcca tcccgtcatg gtcagccaac 600agctcggtgg caggctccac aagttcttcc ctgtgtgaac actgggtgag ccggctcccg 660ggcagcaagc cagccagcgt gacctctgca gcttcctcca gacgcacctc cctggcgtcg 720ctctctgagt ccgtggagat gactggagaa aggagtgaag atgatggagg cttttcaact 780cgaccatttg tgagaagtgt ccagcgtcag agtttgtcat ccagatcttc tgtcaccagc 840cccttggccg tcaatgaaaa ttgcatgaga ccttcatggt ccctgtctgc taagctgcag 900atgcgctcca attctccatc ccgattttca ggggattcgc caattcacag ctctgcttcc 960accttggaga agattgggga ggcagcagat gacaaggtct ccatctcttg ctttggtagc 1020ttgcggaacc tttctagcag ttaccaggaa ccaagcgaca gtcatagtct ccgtgagcac 1080aaggctgtgg gccgggcccc tctggctgtc atggaaggcg tgttcaaaga cgaatcggac 1140acccgcagat tgaactccag tgtcgtagat acacagagca aacattcagc acaaggggac 1200cgcctgcccc cgctctctgg tccatttgat aacaataatc agatcgctta tgtggatcag 1260agcgactccg tagacagctc tccagtcaaa gaggtgaaag cccccagcca cccaggctca 1320ctcgcaaaga aaccagagag cacaactaag agatccccca gttccaaagg cacttctgag 1380ccagagaaaa gcttgcggaa ggggagacca gccttggcaa gccaggagtc atccctttca 1440agtacatccc cttcttctcc tcttcctgta aaagtctctc taaagccctc ccgctcccgc 1500agcaaagcag attcttcttc caggggcagt ggacggcatt catcccctgc ccctgcccaa 1560cccaaaaagg agtcatcccc gaaatctcag gactccgtgt cgtctccttc gccacagaag 1620cagaagtcag cctcggccct cacctacact 
gcttcctcca catctgccaa aaaggcctcg 1680ggccctgcca caaggagccc tttcccacct gggaagagca ggacttcaga ccacagcttg 1740agtagagagg gctccagaca aagcttgggt tctgacagag ccagcgccac ctccacctcc 1800aaacccaatt cccctcgggt gagccaggcc cgagcagggg agggcagggg ggccgggaag 1860cacgtgcgga gctcctccat ggccagcctg cgctccccca gcacaagcat caagtctggt 1920ttgaagaggg acagcaagtc tgaggacaag gggctgtcct tcttcaaatc agccttgaga 1980cagaaggaaa cccggcgctc gacggatctt ggcaagacag ccttgctctc taaaaaggct 2040ggtgggagct ctgttaagtc tgtctgtaag aacaccgggg acgacgaggc agagagaggc 2100caccagcctc cagcttccca gcagccaaat gcaaatacaa cgggaaaaga gcagcttgtc 2160accaaggacc ctgcttctgc caaacattcc ctgctgtccg ctcgcaaatc caagtcttcc 2220caactagact ctggagttcc ctcgtctccg ggtggcaggc agtctgcaga gaaatcctca 2280aaaaagttat cttctagcat gcaaacctct gcacggcctt ctcaaaaacc tcagtgatat 2340ttctgcaatc gaagtgtttt atctgtaaag atgtttattt atttagaacc cctgccctcc 2400caaaaaaaaa aaaaa 2415 49 1843 DNA Homo sapiens misc_feature Incyte ID No71514704CB1 49 gtgaaacaca ttcatcttca tcagaaagag cctcatcttt aattgcataaaacagaaagg 60 ttggctcgca cttccctcgg ttggtgactc cgggcgcgtc gaagatcttgtacctgtcgt 120 tcgcagtggc tcactttccc aagccatggg cagcaggaag aaggaaattgccctgcaggt 180 caacatcagc acccaagagc tttgggagga aatgctcagt tccaaaggactaactgttgt 240 tgatgtctat caaggctggt gtggcccctg caaacctgtg gtgagcctcttccagaagat 300 gaggatcgag gtcggcctgg accttctgca ctttgcatta gcagaggcagatcgtcttga 360 tgtcctcgaa aagtacagag ggaagtgcga gccaaccttt ctgttttatgcaggaggaga 420 actggtggct gtggttagag gagcaaatgc cccactgctg cagaaaaccatcctagacca 480 gctggaggcc gaaaagaaag tgctggctga aggcagagaa cggaaagtgattaaagatga 540 ggctctttct gatgaagatg aatgtgtttc ccatggaaag aataatggtgaagatgagga 600 catggtttca tcagagagga cctgtacctt ggccatcatt aaaccagatgcagtggccca 660 tggaaagact gatgagatta tcatgaagat tcaggaagct gggtttgaaattctaacaaa 720 tgaagagaga accatgacag aggcagaagt gcgacttttc taccaacacaaagctggaga 780 ggaggcattt gagaagctgg tacatcacat gtgcagtgga ccaagccacctcctgatcct 840 caccaggact gagggcttcg aggacgtggt cactacctgg cgaaccgtcatgggcccccg 900 
tgaccccaat gtggccagga gggagcagcc agaaagtctc cgagctcagtacggcacaga 960 aatgcccttc aatgccgtcc atggaagccg ggacagagaa gatgctgacagagaactggc 1020 attgctcttc cccagtttga aattttcaga caaagataca gaagcccctcagggcggtga 1080 ggctgaagca acagcggggc ccactgaggc gctttgcttt cctgaggatgtggattgaga 1140 tggctgtgct gctcttccag agcacgtgac caggggtcta ctgcacaaaacagacctccg 1200 gaatcggaac ttactctttt gagtaccaat actttttggt ttagctgtcttttgtagaca 1260 atgaaaataa tatggctggg cgcagtgact cacacctgta atcccagcactttgggaggc 1320 cgaggcaggt ggatcacctg agattaggag tttgacacca gcctagccaacatggtgaaa 1380 ccctgtctct actaaaaata caaaaaatta gccgggcgtg gtggtgtgcacctgtcatcc 1440 agctactcgg gtggcttagg catgagaatc gcttgaaccc gggaggtgaaggttgcagta 1500 agccgagatc atgccactgc actccagcct gggtgacaga gtgagactccatcttaataa 1560 taataataac aaaaaaaagt aacagaaaca acaaaacaac aagggcgcggcccgcgaatt 1620 agttgacgct tcgttagacc cgggaaatta atcccggaac cggttcccttgcagggggct 1680 ctgcaagaat ttgcaatatc aagctttatt ggaatcccgg tcaaacctcgaagggagggc 1740 cccgggtacc caaattcgcc cttatagtga gtgatattac gcagcgcataagtggccgcc 1800 gttttacaag gtgtgcttgg aaaccccggg tttccaatta tgc 1843 501827 DNA Homo sapiens misc_feature Incyte ID No 7715945CB1 50 gagtcggcggcggtggcgga ggcggtgagt gcgcggctcc ggggctggcc gactccgcta 60 gtggcccggccggcctgtgc tcgggggctc cgggctctgg gctctgggtg cgcggaccgg 120 gccaggctgcttgaagacct cgcgacctgt gtcagcagag ccgccctgca ccaccatgtg 180 catcatcttctttaagtttg atcctcgccc tgtttccaaa aacgcgtaca ggctcatctt 240 ggcagccaacagggatgaat tctacagccg accctccaag ttagctgact tctgggggaa 300 caacaacgagatcctcagtg ggctggacat ggaggaaggc aaggaaggag gcacatggct 360 gggcatcagcacacgtggca agctggcagc actcaccaac tacctgcagc cgcagctgga 420 ctggcaggcccgagggcgag gtgaacttgt cacccacttt ctgaccactg acgtggacag 480 cttgtcctacctgaagaagg tctctatgga gggccatctg tacaatggct tcaacctcat 540 agcagccgacctgagcacag caaagggaga cgtcatttgc tactatggga accgagggga 600 gcctgatcctatcgttttga cgccaggcac ctacgggctg agcaacgcgc tgctggagac 660 tccctggaggaagctgtgct ttgggaagca gctcttcctg gaggctgtgg aacggagcca 720 
ggcgctgcccaaggatgtgc tcatcgccag cctcctggat gtgctcaaca atgaagaggc 780 gcagctgccagacccggcca tcgaggacca gggtggggag tacgtgcagc ccatgctgag 840 caagtacgcggctgtgtgcg tgcgctgccc tggctacggc accagaacca acactatcat 900 cctggtagatgcggacggcc acgtgacctt cactgagcgt agcatgatgg acaaggacct 960 ctcccactgggagaccagaa cctatgagtt cacactgcag agctaacccc acctctgggc 1020 ctggccagtgggctcctggg gggccctgcc ttgaggggca ctgtggacag gaaaccttcc 1080 tttgccatactgcattgcac tgcccgtggc ttggccagca tcccccggat cagggccctg 1140 tggtttgcgtgttacccatc tgtgtcccca tgcccagttc agggtctgcc tttatgccag 1200 tgaggagcagcagagtctga tactaggtct aggaccggcc gaggtatacc atgaacatgt 1260 gggtacacctgagcccactc ttgcacatgt acacaggcac tcacatggca cacacataca 1320 ctcctgcgtgtgcacaagca cacacatgca aggcatatac atggacaccg acacaggcac 1380 atgtacatgcacaggtgtgc tacacatgtg cacacatgca cagttgcaca gacacacaca 1440 cacaggtgcacacacacgat gccgaacaag gcagaagggc gactctcacc tctcatgtgc 1500 ttctggccagtaggtctttg ttctggtcca acgacaggag taggcttgta tttaaaagcg 1560 gcccctcctctcctgtggcc acagaacaca ggcgtgcttg gactcttgac aagcagacct 1620 gctcctgcagaggagacagc cacatttgga attgggcacc gagaagacct gagaaaaacc 1680 cactctctcttttttttttt ttttgagacg gagtcttgct ctgtcaccca ggctggagtg 1740 cagtggcacgatctcggctc actgcaacct ccgcctccca agttcaagca attctcctgc 1800 cccagcctcctgattagctg ggattac 1827 51 794 DNA Homo sapiens misc_feature Incyte IDNo 7025368CB1 51 gcccaggaga gctcgaccca cccaggcaca ccatagcccc agagatggctggggacacag 60 aagtgtggaa gcaaatgttt caggagttaa tgcgggaggt gaagccatggcacaggtgga 120 ccctgagacc agacaagggc cttcttccca acgtcctgaa gccaggctggatgcaatacc 180 agcagtggac cttcgccagg ttccagtgct cctcctgctc tcgtaactgggcctctgccc 240 aagttctggt ccttttccac atgaactgga gtgaggagaa gtccaggggccaggtgaaga 300 tgagggtgtt tacccagaga tgtaagaagt gcccccaacc tctgtttgaggaccctgagt 360 tcacacaaga gaacatctca aggatcctga aaaacctggt gttccgaattctgaagaaat 420 gctatagagg aagatttcag ttgatagagg aggttcctat gatcaaggacatctctcttg 480 aagggccaca caatagtgac aactgtgagg catgtctgca gggcttctgtgctgggccca 540 tacaggttac aagcctcccc 
ccatctcaga ccccaagagt acactccatttacaaggtgg 600 aggaggtagt taagccctgg gcctcaggag agaatgtcta ttcctacgcatgccaaaacc 660 acatctgtag gaacttaagc attttctgct gttgtgtcat tctcattgttatcgtggtga 720 ttgttgtaaa aactgctata tgagccttgg aaacatgacg ctaagtcagtaataaaatca 780 gattcaaaaa gtca 794 52 1539 DNA Homo sapiens misc_featureIncyte ID No 7500954CB1 52 tttcacataa ctagaaggtg atagacagta tgacagtggaatacctcttg ttctcattta 60 ggacagtgtc tgaattgtag tatgagagac ttaagttagataatttagtg catttcctag 120 tagtgagagt tgttaattgt taacacttgg aaggaattggtatcataagg aattgacatc 180 ataagttttg gaaatctttt tctgctagga tgctttatagcctatctgaa tttgaaagtg 240 tagtgtagat taactgtttt ttagccatat tattctgtggtattccattt aggtattgtg 300 aattattcag aatgcagacc ccacgatata gaggtacaaattatattcct tccaaactaa 360 tacacaaact atattatgtg cttttctttg taaaacagtttgttagggtt gtgctatttg 420 aacatatttt ctggtggata ttctcagctt tgaacactgtacatttactg gtaagatttt 480 atttataaat tacacatttt aaggaaattt ctatacttctaaaactcgaa atattttttg 540 aaagaggtaa ttgctatata tctgctttcc ttctaaaatatttcaaatat atctggtggc 600 tactgtttta tgtatggaga aggtaaggca ttgagattaactttctcata cttacaaaat 660 gggctgtgtg tggtgctgaa aataagaccg gtgtcctgactcccagtttg ttttctgtgt 720 attagaccag actgcttact taaagtttta ttgcgctaaaaatctgttca taagtttgag 780 ctccattttt ttctgtccat tattacaaca tagagaagcacattcatatc cccagagaaa 840 tttagtgata tgcagcaact ttgtttcact cttgacattaagtgtacatt gttaacattt 900 tttatcttat gattgaatgt ttcaggtttt cctcctccaccaggcgctcc acctccatct 960 cttataccaa caatagaaag tggacattcc tctggttatgatagtcgttc tgcacgtgca 1020 tttccatatg gcaatgcgat gaagaacgat acagatacagggaatatgca gaaagaggtt 1080 atgagcgtca cagagcaagt cgagaaaaag aagaacgacatagagaaaga cgacacaggg 1140 agaaagagga aaccagacat aagtcttctc gaagtaatagtagacgtcgc catgaaagtg 1200 aagaaggaga tagtcacagg agacacaaac acaaaaaatctaaaagaagc aaagaaggaa 1260 aagaagcggg cagtgagcct gcccctgaac aggagagcaccgaagctaca cctgcagaat 1320 aggcatggtt ttggcctttt gtgtatatta gtaccagaagtagatactat aaatcttgtt 1380 atttttctgg ataatgttta agaaatttac cttaaatcttgttctgtttg ttagtatgaa 
1440 aagttaactt tttttccaaa ataaaagagt gaatttttcatgttaagtta aaaatctttg 1500 tcttgtacta tttcaaaaat aaaaagacag caatgactt1539

What is claimed is:
 1. An isolated polypeptide selected from the group consisting of: a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-26.
 2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-26.
 3. An isolated polynucleotide encoding a polypeptide of claim 1.
 4. An isolated polynucleotide encoding a polypeptide of claim 2.
 5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52.
 6. A recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide of claim 3.
 7. A cell transformed with a recombinant polynucleotide of claim 6.
 9. A method of producing a polypeptide of claim 1, the method comprising: a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide comprises a promoter sequence operably linked to a polynucleotide encoding the polypeptide of claim 1, and b) recovering the polypeptide so expressed.
 10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence selected from the group consisting of SEQ ID NO:1-26.
 11. An isolated antibody which specifically binds to a polypeptide of claim 1.
 12. An isolated polynucleotide selected from the group consisting of: a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:27-52, c) a polynucleotide complementary to a polynucleotide of a), d) a polynucleotide complementary to a polynucleotide of b), and e) an RNA equivalent of a)-d).
 14. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof.
 16. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
 17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable excipient.
 18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence selected from the group consisting of SEQ ID NO:1-26.
 20. A method of screening a compound for effectiveness as an agonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting agonist activity in the sample.
 23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting antagonist activity in the sample.
 26. A method of screening for a compound that specifically binds to the polypeptide of claim 1, the method comprising: a) combining the polypeptide of claim 1 with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide of claim 1 to the test compound, thereby identifying a compound that specifically binds to the polypeptide of claim 1.
 28. A method of screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method comprising: a) exposing a sample comprising the target polynucleotide to a compound, under conditions suitable for the expression of the target polynucleotide, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
 29. A method of assessing toxicity of a test compound, the method comprising: a) treating a biological sample containing nucleic acids with the test compound, b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof, c) quantifying the amount of hybridization complex, and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
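
By way of illustration only, the sequence relationships recited in claims 12 and 14 (a complementary polynucleotide, an RNA equivalent, a naturally occurring sequence at least 90% identical, and a probe of at least 20 contiguous nucleotides complementary to the target) can be sketched in Python. The function names below are hypothetical and are not part of the disclosure. The identity measure is a simplifying assumption: it compares two equal-length, pre-aligned sequences position by position, rather than using any particular alignment program. The example fragment is the first 30 nucleotides of SEQ ID NO:51 as printed in the sequence listing.

```python
from typing import List

# Illustrative helpers only. All names here are hypothetical, and the
# ungapped identity measure is an assumption made for this sketch.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of a DNA sequence, read 5' to 3'."""
    return dna.lower().translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence (thymine replaced by uracil)."""
    return dna.lower().replace("t", "u")


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.lower(), b.lower()))
    return 100.0 * matches / len(a)


def candidate_probes(target: str, length: int = 20) -> List[str]:
    """List every probe of the given length that is complementary to a
    contiguous window of the target sequence."""
    target = target.lower()
    return [
        reverse_complement(target[i:i + length])
        for i in range(len(target) - length + 1)
    ]


# First 30 nucleotides of SEQ ID NO:51 as printed in the sequence listing.
FRAGMENT = "gcccaggagagctcgacccacccaggcaca"

if __name__ == "__main__":
    print("complement:    ", reverse_complement(FRAGMENT))
    print("RNA equivalent:", rna_equivalent(FRAGMENT))
    print("20-nt probes:  ", len(candidate_probes(FRAGMENT)))
```

Whether a given probe hybridizes specifically depends on the hybridization conditions and on the other nucleic acids in the sample, neither of which this sketch models.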